I have a website (Ex: http://www.examplesite.com), and I am creating another site as a separate, stand-alone site in IIS. This second site’s URL will make it look like it’s part of my main site: http://www.examplesite.com/anothersite. This is accomplished by creating a virtual directory under my main site that points to the second site.
I am allowing my main site (www.examplesite.com) to be indexed in search engines, but I do not want my second, virtual directory site to be seen by search engines. Can I allow my second site to have its own robots.txt file, and disallow all pages for that site there? Or do I need to modify my main site’s robots.txt file and tell it to disallow the virtual directory?
You can’t have a separate robots.txt for a directory. Only a “host” can have its own robots.txt: example.com, www.example.com, sub.example.com, sub.sub.example.com, …
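To illustrate the "one robots.txt per host" point, here is a minimal sketch that derives the robots.txt location for any page URL. The helper name robots_txt_url is made up for this example; it only assumes that the file always sits at the root of the host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical URLs, matching the hosts used in this answer.
print(robots_txt_url("http://www.example.com/anothersite/foo/bar/"))
print(robots_txt_url("http://sub.example.com/page.html"))
```

Whatever path is passed in, the result points at the host root, which is why a virtual directory cannot carry its own file.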
So if you want to set rules for www.example.com/anothersite, you have to use the robots.txt at www.example.com/robots.txt.
If you want to block all pages of the sub-site, simply add:
User-agent: *
Disallow: /anothersite
This will block all URL paths that start with “/anothersite”. For example, all of these URLs are then blocked:
www.example.com/anothersite
www.example.com/anothersite.html
www.example.com/anothersitefoobar
www.example.com/anothersite/foobar
www.example.com/anothersite/foo/bar/
Note: If your robots.txt already contains a User-agent: * block, you’d have to add the Disallow line to that block instead of adding a new block (bots stop reading the robots.txt as soon as they have found a block that matches them).
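If you want to check the prefix matching yourself, here is a rough sketch using Python's standard urllib.robotparser. It assumes the main site's robots.txt contains only a User-agent: * block with Disallow: /anothersite; the bot name "ExampleBot" and the helper is_blocked are made up for the example, and real crawlers may differ in details.

```python
from urllib.robotparser import RobotFileParser

# Assumed content of www.example.com/robots.txt (not fetched from a server).
ROBOTS_TXT = """\
User-agent: *
Disallow: /anothersite
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(url: str, agent: str = "ExampleBot") -> bool:
    """True if the parsed rules forbid the agent from fetching url."""
    return not parser.can_fetch(agent, url)


urls = [
    "http://www.example.com/anothersite",
    "http://www.example.com/anothersite.html",
    "http://www.example.com/anothersitefoobar",
    "http://www.example.com/anothersite/foo/bar/",
    "http://www.example.com/",
    "http://www.example.com/another",
]

for url in urls:
    print(url, "blocked" if is_blocked(url) else "allowed")
```

The first four URLs share the /anothersite prefix and come out as blocked, while the last two stay allowed, so the rest of the main site remains crawlable.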