I am currently working with a client to redevelop their website. One of the final things I need to do before launch, is to make sure that their old website’s pages are correctly redirected to the new URL structure of the new website.
Unfortunately, when I check Google to see how their current site is indexed, this relatively small website appears to have over 1500 pages indexed.
When I look at the indexed links on Google, many appear to be duplicates of the same page, but because of the terrible URI structure used on the old website, Google treats them differently.
For example, the ‘Map’ page is indexed at least twice on Google, under the following 2 URLs:
www.website.com/frame_page-map.html?mp_session=iris7k85851j05q55piqci31u3&mp_session=iris7k85851j05q55piqci31u3?page_code=map&mp_session=iris7k85851j05q55piqci31u3&mp_session=iris7k85851j05q55piqci31u3
www.website.com/frame_page-map.html?mp_session=sel6m8j5cu8lulep4dqa32sne7&mp_session=sel6m8j5cu8lulep4dqa32sne7?page_code=map&mp_session=sel6m8j5cu8lulep4dqa32sne7&mp_session=sel6m8j5cu8lulep4dqa32sne7
Only the session name is different in the URL (and I have no idea why it is repeated four times in a single URL, either).
For reference, the replacement URL for this page is:
http://www.website.com/contact/map
My question is: How do I setup a redirect for these multiple records on Google? Do I simply set-up the redirect for the old URL minus all of the URI parameters (i.e. http://www.website.com/frame_page-map.html) or is there another better method to do this?
Thanks for any help you might be able to offer!
It depends on what your goals are. If you don’t care about the querystrings then setup a 301 (permanent redirect) that points to just your root page – map.html. To prevent google from indexing querystring params as separate pages use the canonical tag and have it reference the parent. This isn’t guaranteed to work, but google takes your canonical into consideration when indexing.
If you care about the querystring values then you will have to setup a redirect for each one. There is a querystring parameter that you can append to your redirects that will tell it to be ignored so you don’t have to write a regex that detects it.