We’re seeing some really weird URLs in our logs and I’ve been told to start redirecting them.
I know of a couple of better ways to go about fixing this, but the boss wants it done this way. I apologize in advance.
We’re seeing stuff like the following in our logs:
I’ve been told to ‘toss some mod_rewrite rules in the .htaccess file’ to take this and strip out all the ob, rpp, and ppg variables.
Now, I’ve found ways to strip everything out. And that wouldn’t be too bad if I could leave the /foo/bar/bla in there. But I can’t seem to do that. Basically, any help would be appreciated.
Try:
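As a minimal sketch, assuming the .htaccess file sits in the document root and the stray parameters are part of the path (the logged URL has no ? in it), a rule along these lines removes one ob, rpp, or ppg pair per pass and leaves /foo/bar/bla intact:

```apache
RewriteEngine On

# Assumption: this file lives in the document root, so the pattern is
# matched against the path without its leading slash.
# Each pass strips a single "&ob=...", "&rpp=..." or "&ppg=..." segment;
# the greedy (.*) means the last such segment is removed first.
RewriteRule ^(.*)&(?:ob|rpp|ppg)=[^&]*(.*)$ /$1$2 [L]
```

Because the rewritten path is fed back through the rules, every remaining pair costs one more internal pass. The rewrite is internal only; sending the client an external redirect to the cleaned URL would need an extra rule that is not shown here.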
The problem here is that your URL:
http://www.example.com/foo/bar/bla&ob=&ppg=&rpp=100&ob=&rpp=&ppg=&rpp=30&ppg=&ppg=1&rpp=10&rpp=50&ob=&ob=&ob=&rpp=40&ob=&rpp=5&rpp=30&rpp=&rpp=20&order_by=&results_per_pge=75
has A LOT of ob=, rpp=, and ppg= in the URI. More than 10. That means you’ll get a 500 internal server error if you use these rules against that URL. By default, Apache has the internal recursion limit set to 10, which means that if it needs to loop more than 10 times (and it will for the above URL), it’ll bail and return a 500. You need to set that higher:
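The directive that controls this limit is LimitInternalRecursion. A sketch for the server or vhost config, where 30 is only an illustrative value chosen for this example:

```apache
# Goes in the main server config or inside a <VirtualHost> block,
# not in .htaccess. The value 30 is an assumption for illustration.
LimitInternalRecursion 30
```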
or some other sane number. Unfortunately, you can’t use that directive in an .htaccess file; you’ll need to go into the server or vhost config and set it there.