We have an odd issue where Google’s Webmaster Tools is alerting us to some rather strange URLs that it is attempting to spider. We have no idea where these are being generated and would like to understand how they came to exist. They are of the form:
http://test.xyz/(F(pC3Rt9u0AmQepAH-uY341LtzYEKrdkOOEgB9nDyfdDB6X9uL__MmT7S-euwfO_yPKKz8gdhnBhv8v1aGkuRj6G61sSaQ7mo1F8-PI32-pZxh9UJjogZk9Shvp7jdTaFLGHLEEw0_TtMNfvgoNMg6iQhOenxOisfvYc0BfbtxM53ksFvR0))/funk-radio
Usually this would take the form:
but ASP.NET MVC seems to deal with the garbage (by ignoring it?) and the (first) request appears to be routed correctly. In fact, all routes seem to work correctly when this garbage is added. For instance:
http://test.xyz/some/other/page
is equally reachable via:
http://test.xyz/(F(pC3Rt9u0AmQepAH-uY341LtzYEKrdkOOEgB9nDyfdDB6X9uL__MmT7S-euwfO_yPKKz8gdhnBhv8v1aGkuRj6G61sSaQ7mo1F8-PI32-pZxh9UJjogZk9Shvp7jdTaFLGHLEEw0_TtMNfvgoNMg6iQhOenxOisfvYc0BfbtxM53ksFvR0))/some/other/page
This implies to me that the garbage has some sort of special meaning. Can anyone enlighten me as to what’s going on here? These routes should not match, yet clearly they do.
EDIT
We’re not using cookieless auth. Here’s the relevant part of the web.config:
<authentication mode="Forms">
<forms loginUrl="~/Account/LogOn" defaultUrl="~" slidingExpiration="true"
timeout="10080"/>
</authentication>
Interestingly, if I connect to the site with the standard user-agent:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0,
using one of these garbage-filled URLs, then the request is honoured.
If, on the other hand, I connect using the Googlebot UA
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
I get a 400 response.
Looking at the URLs, the first thing that comes to mind is that your site is using authentication without cookies. When I worked with ASP.NET, we used cookieless authentication on one project, and the framework put a token like that before the path in every URL.
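One thing worth checking, offered as a sketch rather than a confirmed fix: when the `cookieless` attribute is omitted from `<forms>`, ASP.NET defaults to `UseDeviceProfile`, which decides per user-agent whether to use cookies. A crawler's UA could therefore be handed the ticket in the URL even though ordinary browsers get a cookie. Assuming that is what is happening here, setting the attribute explicitly on the existing element should stop the `(F(...))` segment from being generated:

```xml
<authentication mode="Forms">
  <!-- cookieless="UseCookies" is the assumed change; the other attributes are copied from the question -->
  <forms loginUrl="~/Account/LogOn" defaultUrl="~" slidingExpiration="true"
         timeout="10080" cookieless="UseCookies"/>
</authentication>
```

Session state and anonymous identification have their own `cookieless` settings, which produce `(S(...))` and `(A(...))` segments respectively, so those may be worth checking too if other tokens show up.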