I’m trying to create a controller for my sitemap, but only allow search engines

Asked: May 16, 2026

I’m trying to create a controller for my sitemap, but only allow search engines to view it.

If you look at https://stackoverflow.com/robots.txt you’ll see that their sitemap is https://stackoverflow.com/sitemap.xml. If you try to visit the sitemap, you’ll be redirected to the 404 page.

This meta question confirms this behavior (answered by Jeff himself).

Now I don’t want this question closed as “belongs on Meta”, as I’m just using StackOverflow as an example. What I really need answered is…

How can I block all visitors to a controller EXCEPT for search bots?

1 Answer

Editorial Team · Answer 1 · 2026-05-16T10:28:25+00:00

You can probably create an action filter attribute that rejects requests based on the User-Agent header. Its usefulness is questionable, and it is not a security feature, because the header is easily faked; it will, however, stop people from viewing the sitemap in a stock browser.
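Because the header can be faked, a stricter check would verify the caller's IP address as well. The sketch below assumes the commonly described reverse-then-forward DNS check for Googlebot: resolve the IP to a host name, confirm the name sits under googlebot.com or google.com, then resolve that name back and confirm it returns the same IP. `CrawlerVerifier`, `IsGoogleHost` and `IsVerifiedGooglebot` are hypothetical names, and the domain list is an assumption you should confirm against Google's current documentation.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Sockets;

public static class CrawlerVerifier
{
    // Hypothetical helper: true when hostName is a subdomain of
    // googlebot.com or google.com (assumed domain list).
    public static bool IsGoogleHost(string hostName)
    {
        if (string.IsNullOrEmpty(hostName)) return false;
        string host = hostName.TrimEnd('.');
        return host.EndsWith(".googlebot.com", StringComparison.OrdinalIgnoreCase)
            || host.EndsWith(".google.com", StringComparison.OrdinalIgnoreCase);
    }

    // Reverse lookup, domain check, then forward lookup back to the same IP.
    public static bool IsVerifiedGooglebot(string ipAddress)
    {
        IPAddress ip;
        if (!IPAddress.TryParse(ipAddress, out ip)) return false;
        try
        {
            string host = Dns.GetHostEntry(ip).HostName;
            if (!IsGoogleHost(host)) return false;
            return Dns.GetHostAddresses(host).Any(a => a.Equals(ip));
        }
        catch (SocketException)
        {
            // Resolution failed, so the caller cannot be verified.
            return false;
        }
    }
}
```

DNS lookups are slow, so in practice you would probably cache the result per IP address rather than resolve on every request.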

This page contains a list of user agent strings that googlebot uses.

Sample code to redirect non-googlebots to a 404 action on an error controller:

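// Requires: using System; using System.Web.Mvc;
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false)]
public class BotRestrictAttribute : ActionFilterAttribute {

    public override void OnActionExecuting(ActionExecutingContext c) {
      // Exact match against a single Googlebot user agent string; every other caller is redirected.
      if (c.RequestContext.HttpContext.Request.UserAgent != "Googlebot/2.1 (+http://www.googlebot.com/bot.html)") {
        // "error" is the name of a route assumed to be registered in the application.
        c.Result = new RedirectToRouteResult("error", new System.Web.Routing.RouteValueDictionary(new {action = "NotFound", controller = "Error"}));
      }
    }
}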
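The exact string comparison above only matches one user agent string, while crawlers send several variants. A more tolerant check could look for a token inside the header instead. This is a minimal sketch: `BotUserAgents` and `IsSearchBot` are hypothetical names, and the token list is illustrative rather than complete.

```csharp
using System;

public static class BotUserAgents
{
    // Illustrative tokens only; not an exhaustive or authoritative list.
    private static readonly string[] Tokens = { "Googlebot", "bingbot" };

    // True when the User-Agent header contains any known crawler token
    // (case-insensitive). A missing or empty header is treated as "not a bot".
    public static bool IsSearchBot(string userAgent)
    {
        if (string.IsNullOrEmpty(userAgent)) return false;
        foreach (string token in Tokens)
        {
            if (userAgent.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

Inside the filter, the condition would then become something like `if (!BotUserAgents.IsSearchBot(c.HttpContext.Request.UserAgent))`.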

EDIT: To respond to comments: if server load is a concern for your sitemap, restricting access to bots might not be sufficient. Googlebot by itself can grind your server to a halt if it decides to crawl aggressively, so you should probably cache the response as well. You can use the same filter attribute together with the ASP.NET application cache for that.

Here is a very rough example; it might need tweaking to send the proper HTTP headers:

// Requires: using System; using System.Web.Mvc;
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false)]
public class BotRestrictAttribute : ActionFilterAttribute {

    public const string SitemapKey = "sitemap";

    public override void OnActionExecuting(ActionExecutingContext c) {
      if (c.RequestContext.HttpContext.Request.UserAgent != "Googlebot/2.1 (+http://www.googlebot.com/bot.html)") {
        c.Result = new RedirectToRouteResult("error", new System.Web.Routing.RouteValueDictionary(new {action = "NotFound", controller = "Error"}));
        return;
      }

      // Serve the cached sitemap if present; setting Result skips the action method.
      var sitemap = c.HttpContext.Cache[SitemapKey] as string;
      if (sitemap != null) {
        c.Result = new ContentResult { Content = sitemap, ContentType = "application/xml" };
      }

    }
}

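// In the sitemap action method (requires: using System.Web.Caching;)
string sitemapString = GetSitemap();
// Inside a controller, HttpContext is the controller's own property, so use its Cache directly.
HttpContext.Cache.Add(
 BotRestrictAttribute.SitemapKey, // cache key
 sitemapString, // data
 null, // no dependencies
 DateTime.Now.AddMinutes(1), // absolute expiration: one minute from now
 Cache.NoSlidingExpiration,
 CacheItemPriority.Low,
 null // no removal callback
);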
