I've got several pages on my ASP.NET MVC 3 website (not that the technology matters here) where I render certain URLs in a <script> tag on the page, so that my JavaScript (stored in an external file) can make AJAX calls to the server.
Something like this:
<html>
...
<body>
...
<script type="text/javascript">
$(function() {
myapp.paths.someUrl = '/blah/foo'; // not hardcoded in reality, but N/A here
});
</script>
</body>
</html>
Now, on the server side, most of these URLs are protected with attributes stating that:
a) They can only be accessed via AJAX (e.g. XMLHttpRequest)
b) They can only be accessed via HTTP POST (they return JSON, so this is for security)
The problem is that, for some reason, bots are crawling these URLs and issuing HTTP GETs against them, resulting in 404s.
I was under the impression that bots shouldn't try to crawl JavaScript. So how are they getting hold of these URLs?
Is there any way I can prevent them from doing this?
I can't really move these URL variables to an external file because, as the comment in the code above suggests, I render the URLs with server code (which must be done on the actual page).
For now I've added a route to my website that returns HTTP 410 (Gone) for these URLs when the request is not an AJAX POST. That is really annoying, because it adds yet another route to my already convoluted route table.
Any tips/suggestions?
Disallow the URLs by their common prefix in robots.txt. Well-behaved crawlers will then skip anything under that prefix.
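A rough sketch of what that robots.txt rule might look like. It assumes the AJAX endpoints all live under a common prefix; `/blah/` is only borrowed from the placeholder URL in the question, so substitute whatever prefix your real routes share:

```text
# robots.txt, served from the site root
# "/blah/" is a hypothetical prefix taken from the question's placeholder URL
User-agent: *
Disallow: /blah/
```

Disallow rules match by path prefix, so this one line covers /blah/foo and everything else under /blah/. Keep in mind that robots.txt is only advisory: it stops crawlers that choose to honor it, so the AJAX-only and POST-only attributes on the server should stay in place.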