I’ve already been in contact with my web host and they’ve been less than helpful, so I’ve come to the geniuses here.
I’m unable to rewrite any of the URLs on my website via the .htaccess file.
I only have one .htaccess file, which is in the root of my home directory. Here is that file:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# The support guys thought it was an issue with the L flag below, so I commented out my original implementation and used their supplied one
#RewriteCond %{HTTP_HOST} ^www.mythofechelon.co.uk$ [NC]
#RewriteRule ^(.*)$ http://mythofechelon.co.uk/$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^www\.(([a-z0-9_]+\.)?mythofechelon\.co.uk)$ [NC]
RewriteRule .? http://%1%{REQUEST_URI} [R=301,L]
RewriteRule ^/$ /main/pages/index.php?home
RewriteRule ^/home(.*)?$ /main/pages/index.php?home
RewriteRule ^/404(.*)?$ /main/pages/index.php?404
#I will eventually change the following commands to link to the rewritten URLs when this all eventually works
DirectoryIndex /main/pages/index.php?home
ErrorDocument 404 /main/pages/index.php?404
AddType application/x-shockwave-flash swf
Options All -Indexes
#Protect .htaccess
<files .htaccess>
order allow,deny
deny from all
</files>
<Files *.reg>
ForceType application/pdf
Header set Content-Disposition attachment
</Files>
#Block bots
<limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</limit>
RewriteRule ^.* - [F,L]
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
SetEnvIfNoCase user-Agent ^FrontPage [NC,OR]
SetEnvIfNoCase user-Agent ^Java.* [NC,OR]
SetEnvIfNoCase user-Agent ^Microsoft.URL [NC,OR]
SetEnvIfNoCase user-Agent ^MSFrontPage [NC,OR]
SetEnvIfNoCase user-Agent ^Offline.Explorer [NC,OR]
SetEnvIfNoCase user-Agent ^[Ww]eb[Bb]andit [NC,OR]
SetEnvIfNoCase user-Agent ^Zeus [NC]
(Man, you guys need to change it so that you don’t have to manually indent every single line.)
I have independently confirmed that:
- The `DirectoryIndex` command is working, as `DirectoryIndex /main/pages/index.php` works.
- The `ErrorDocument 404` command is working, obviously.
- Linking to files using PHP variables works, as the current implementations of the `DirectoryIndex` and `ErrorDocument 404` commands work.
- It is not an issue with any of the file-protecting or bot-blocking commands.
- It is not an issue with the “www.” removing commands, as I have commented out and completely removed all attempted implementations of them and still had the same issues.
The issue seemingly lies entirely with the RewriteRule commands. RewriteEngine is enabled, at least in the .htaccess, and mod_rewrite was working a few days ago, before I restarted my site.
I’m thinking that it may be because the RewriteRules have no RewriteConds, but these exact commands were working a few days ago.
In the .htaccess you posted above, there is no RewriteRule immediately following these rules:
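Judging from the posted file, the rules meant here are presumably the two conditions at the top, which check that the request does not map to an existing directory or file:

```apache
# Conditions from the top of the posted .htaccess.
# On their own they do nothing; they only qualify the next RewriteRule.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
```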
So they will be combined with the next uncommented rule, which does the redirect but does not define which file should handle these requests:
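For reference, this is how the top of the posted file is effectively read once the commented-out lines are ignored (lines copied from the question; only the comments are added):

```apache
# All three conditions attach to the single RewriteRule below them,
# because a RewriteCond always belongs to the next RewriteRule.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{HTTP_HOST} ^www\.(([a-z0-9_]+\.)?mythofechelon\.co.uk)$ [NC]
RewriteRule .? http://%1%{REQUEST_URI} [R=301,L]

# These later rules therefore run with no conditions at all.
RewriteRule ^/$ /main/pages/index.php?home
RewriteRule ^/home(.*)?$ /main/pages/index.php?home
```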
You’ll want something like this:
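A minimal sketch of one possible arrangement, not necessarily the exact rules originally given in this reply. It assumes the redirect should run first and that the file/directory conditions are meant to guard the `home` rule. It also writes the patterns without a leading slash, since in a per-directory (.htaccess) context mod_rewrite strips the directory prefix, including that slash, before matching:

```apache
RewriteEngine On

# 1. Strip the "www." prefix first (adapted from the question,
#    with the last dot in the host name escaped).
RewriteCond %{HTTP_HOST} ^www\.(([a-z0-9_]+\.)?mythofechelon\.co\.uk)$ [NC]
RewriteRule .? http://%1%{REQUEST_URI} [R=301,L]

# 2. Site root: no conditions here, because the root is a real directory
#    and a "!-d" check would stop the rule from ever firing.
RewriteRule ^$ /main/pages/index.php?home [L]

# 3. RewriteCond lines apply only to the one RewriteRule that follows,
#    so they sit directly above the rule they are meant to guard.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^home(.*)?$ /main/pages/index.php?home [L]
```

The `[L]` flags are an addition in this sketch; they stop further rules in the file from being applied to a request that has already been rewritten.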
Order of rules is essential with mod_rewrite. Another example in your file where things are out of order is the section where you’re trying to block bad bots. The `RewriteRule` must come after the `RewriteCond` rules. Also, your limit section doesn’t actually do anything since none of the rules about the bots actually set the environment variable.

There’s actually another directive you can use specifically for looking at user agents and setting environment variables: `BrowserMatch` and `BrowserMatchNoCase` (http://httpd.apache.org/docs/2.2/mod/mod_setenvif.html#browsermatchnocase).

I’d replace the lines for the bad bots with something like this:
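A hedged sketch of what that replacement could look like. It assumes the variable name `bad_bot`, taken from the limit section in the question, and it abbreviates the list to a handful of the agents; the remaining entries would follow the same pattern:

```apache
# Each line sets the bad_bot environment variable when the User-Agent
# header matches the regex. Matching is case-insensitive, so no [NC]
# flag is needed (mod_rewrite flags are not valid here anyway).
BrowserMatchNoCase ^Anarchie bad_bot
BrowserMatchNoCase ^ASPSeek bad_bot
BrowserMatchNoCase "^Download Demon" bad_bot
BrowserMatchNoCase HTTrack bad_bot
BrowserMatchNoCase ^Wget bad_bot
BrowserMatchNoCase ^Zeus bad_bot

# The former SetEnvIfNoCase entries fit the same form.
BrowserMatchNoCase ^FrontPage bad_bot
BrowserMatchNoCase ^Java bad_bot
```

Patterns containing a space are quoted rather than backslash-escaped, which is a choice made for this sketch.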
Then move your limit section below the `BrowserMatchNoCase` entries; otherwise the environment variable may not be set yet.

Also, mod_rewrite flags are not valid with `SetEnvIfNoCase` entries.

Update
To handle the 404s you could either add the following:
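The rule originally shown here is not preserved, so the following is only a guess at its shape, modelled on the `404` rule from the question and written without the leading slash that .htaccess patterns never see:

```apache
# Send /404... requests to the script's 404 page, but only when the
# request does not already map to a real directory or file.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^404(.*)?$ /main/pages/index.php?404 [L]
```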
Or (and this is what I would suggest) you could change `home` to `.*` in the rule pattern and then update your PHP script to send the 404 when appropriate.
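As a rough sketch of that second option, assuming the change is made to the pattern of the `home` rule and the query string is left alone:

```apache
# Catch-all: any request that is not an existing directory or file
# is handed to index.php, which then decides whether to render a
# page or respond with a 404 status.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* /main/pages/index.php?home [L]
```

Because the rewritten target is an existing file, the `!-f` condition fails on the second pass, so the rule should not loop. How the script works out which page was requested (for example, by reading the original request URI) is outside this sketch.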