I have a page that uses Ajax and I want to make it crawlable for SEO. Google’s specification (https://developers.google.com/webmasters/ajax-crawling) says I need to use “#!” to build a “pretty URL” and to provide an HTML snapshot for the crawler. How do I create the HTML snapshot in C#?
Though I’m not using it in production yet, I’ve found that PhantomJS (a WebKit-based headless browser) is quite up to the task. I wrote a post on the subject.
After the DOM finishes loading and the Ajax requests have completed, I just copy the whole DOM. PhantomJS is scripted in JavaScript, so it’s very easy to get the DOM contents as HTML.
This is not a C#-specific solution, but the interface is trivial and PhantomJS also runs on Windows. Whenever I get a request with _escaped_fragment_ in the URL, a matching MVC route redirects the crawler to the cached snapshot.
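To illustrate the routing step, here is a minimal, language-neutral sketch in Python of what such a route has to do: detect the _escaped_fragment_ query parameter, rebuild the original “#!” URL, and derive a key for the cached snapshot. The function names and the hash-based file naming are my own assumptions for illustration, not part of Google’s specification or of PhantomJS; the same logic should port to an MVC route in C# without much trouble.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ESCAPED_FRAGMENT = "_escaped_fragment_"


def pretty_url_from_crawler_url(url):
    """Rebuild the '#!' URL from a crawler request URL.

    Returns None when the URL carries no _escaped_fragment_ parameter,
    i.e. when it is an ordinary (non-crawler) request.
    """
    parts = urlsplit(url)
    fragment = None
    remaining = []
    # parse_qsl percent-decodes values, which undoes the crawler's escaping.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == ESCAPED_FRAGMENT:
            fragment = value
        else:
            remaining.append((key, value))
    if fragment is None:
        return None
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(remaining), "!" + fragment)
    )


def snapshot_file_name(pretty_url):
    """Hypothetical cache key: a SHA-256 hash of the pretty URL plus '.html'."""
    return hashlib.sha256(pretty_url.encode("utf-8")).hexdigest() + ".html"


if __name__ == "__main__":
    crawler_url = "http://example.com/page?a=1&_escaped_fragment_=products%2F42"
    pretty = pretty_url_from_crawler_url(crawler_url)
    print(pretty)
    print(snapshot_file_name(pretty))
```

The snapshot generator (PhantomJS in my case) would store each rendered page under the same key, so the route only needs to look the file up and serve or redirect to it.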