I would like to use my hosting for live testing, but I want to restrict access and prevent search engine indexing.
For example, the server directory structure within public_html is:
_private
_bin
_cnf
_log
_ … (more default hosting directories)
testpublic
css
images
index.html
I want index.html to be visible to everyone, and all other directories (except “testpublic”) to be hidden, access-protected, and not indexed by search engines.
I would like the “testpublic” directory to be public but not indexed by search engines; I am not sure whether this is possible.
As I understand it, I need two .htaccess files:
a general one in “public_html” and a specific one for “testpublic”.
I think the general .htaccess (public_html) should be something like:
AuthUserFile /home/folder../.htpasswd
AuthName "test!"
AuthType Basic
require user admin123
<FilesMatch "index.html">
Satisfy Any
</FilesMatch>
Can anyone help me create the files with the appropriate properties? Thank you!
You can use a robots.txt file in your root folder. All standards-abiding robots will obey this file and not index your files and folders.
Example robots.txt that tells all (*) crawlers to move on and index nothing:
User-agent: *
Disallow: /
You could also use .htaccess files to fine-tune what your server (assuming Apache) serves and which directory indexes are visible. In that case, you would add Options -Indexes to your .htaccess file to disable directory indexes.
Updated (Credit to https://stackoverflow.com/users/1714715/samuel-cook):
If you want to specifically stop a bot/crawler and know its User-Agent string, you can do so in your .htaccess.
Hope this helps.
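For the user-agent case, a minimal sketch might look like the following. It assumes Apache with mod_rewrite enabled and permitted in .htaccess files, and `BadBot` is only a placeholder for the crawler's actual User-Agent string:

```apache
# Sketch only: "BadBot" is a placeholder, not a real crawler name.
# Assumes mod_rewrite is loaded and AllowOverride permits rewrite rules here.
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Match the User-Agent request header, ignoring case
    RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
    # Answer every matching request with 403 Forbidden
    RewriteRule ^ - [F]
</IfModule>
```

Unlike robots.txt, which relies on the crawler choosing to comply, this is enforced by the server. It only works as long as the crawler sends the User-Agent string you are matching against.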