I was fixing URLs on a website. One of the problems was that the URLs contained characters that were sometimes upper-case and sometimes lower-case. The server did not care, but Google did, and indexed the pages as duplicates.
Some URLs also contained characters that are simply not allowed in that part of the URL, like commas “,” and round brackets “()”. Although [round brackets are technically not reserved][1],
I still decided to get rid of them by encoding them.
I added a check that tests whether the requested URL is valid and, if it is not, does a 301 redirect to the correct URL.
For example,
http://www.example.com/articles/SomeGreatArticle(2012).html
would do a 301 redirect to
http://www.example.com/articles/somegreatarticle%282012%29.html
It works, and it takes a single redirect to reach the correct URL.
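A minimal sketch of such a check, in Python. It is not the code that runs on the site; the function names `canonical_url` and `redirect_target` are made up for illustration, and it assumes that "valid" means a lower-case path in which everything except letters, digits, `_.-~` and `/` is percent-encoded.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def canonical_url(url):
    """Return the URL with a lower-case, fully percent-encoded path."""
    parts = urlsplit(url)
    # Decode first so an already-encoded path is not encoded twice.
    path = unquote(parts.path).lower()
    # safe="/" keeps slashes; "(", ")" and "," are percent-encoded.
    path = quote(path, safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment)
    )


def redirect_target(url):
    """Return the URL to 301 to, or None if the URL is already canonical."""
    canonical = canonical_url(url)
    return None if canonical == url else canonical


print(redirect_target("http://www.example.com/articles/SomeGreatArticle(2012).html"))
```

A request handler would call `redirect_target` with the requested URL and send a 301 only when the result is not `None`.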
But for a small fraction of the pages (which are possibly the only pages Google has indexed so far), Google Webmaster Tools started to give me the following error under the Crawl errors > Not followed tab:
Google couldn’t follow your URL because it redirected too many times.
Googling for this error in quotes gives me 0 results, and I’m sure I’m not the only one ever to get it, so I would like to know more about it. For example:
- How many redirects can a single page do before Google decides that it is too many?
- What are the other possible causes of such an error?
SOLUTION
According to this experiment, http://www.monperrus.net/martin/google+url+encoding,
Google has its own character encoding rules: it will always encode some characters and always decode others.
The following characters are never encoded
So even if you give Google this URL
where the round brackets () are encoded, Google will transform it, decode the brackets, and follow this URL instead:
What happened in my situation:
my server would do a 301 redirect to
while Googlebot would ignore the encoded brackets and follow:
get redirected to
follow
get redirected to
and give up after a couple of tries and show the “Google couldn’t follow your URL because it redirected too many times” error.
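The loop can be reproduced with a small simulation. This is only a sketch under assumptions: it reuses the example path from the question, `crawler_rewrite` models nothing more than "the crawler decodes `%28` and `%29`", and `max_hops=5` is an arbitrary stand-in, since the real limit is not known here. The `safe` parameter is also hypothetical; passing `"/()"` shows one possible way out, which is to stop encoding the brackets on the server side.

```python
from urllib.parse import quote, unquote


def server_redirect(path, safe="/"):
    """Return the 301 target, or None if the path is already canonical."""
    wanted = quote(unquote(path).lower(), safe=safe)
    return None if wanted == path else wanted


def crawler_rewrite(path):
    """Assumed crawler behaviour: round brackets are always decoded."""
    return path.replace("%28", "(").replace("%29", ")")


def crawl(path, safe="/", max_hops=5):
    """Follow redirects; return (requested paths, whether a page was reached)."""
    hops = []
    for _ in range(max_hops):
        requested = crawler_rewrite(path)
        hops.append(requested)
        target = server_redirect(requested, safe)
        if target is None:
            return hops, True   # the server answered without redirecting
        path = target           # follow the 301
    return hops, False          # gave up: redirected too many times


start = "/articles/SomeGreatArticle(2012).html"
print(crawl(start))              # server encodes brackets: never settles
print(crawl(start, safe="/()"))  # server leaves brackets alone: settles
```

With the default `safe="/"`, every request after the first is for the same lower-case path with decoded brackets, and every response redirects to the encoded form again, so the crawl never reaches a page.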