We’re adding some functionality to our CMS whereby when a user creates a page, they can select an option to allow/disallow search engine indexing of that page.
If they select yes, then something like the following would apply:
<cfif request.variables.indexable eq 0>
<cffile
action = "append"
file = "C:\websites\robots.txt"
output = "Disallow: /blocked-page.cfm"
addNewLine = "yes">
<cfelse>
<!-- check if page already disallowed in robots.txt and remove line if it does --->
</cfif>
It’s the <cfelse> clause I need help with.
What would be the best way to parse robots.txt to see if this page had already been disallowed? Would it be a cffile action=”read”, then do a find() on the read variable?
Actually, the check on whether the page has already been disallowed would probably go further up, to avoid double-adding.
You keep the list of pages in database and each page record has a
indexablebit, right? If yes, simpler and more reliable approach would be to generate new robots.txt each time some page is added/deleted/changes indexable bit.Sample code is not tested, be aware of typos/bugs.