I’m trying to scrape a page but the initial response has nothing in the

Question


Asked: May 15, 2026


I’m trying to scrape a page, but the initial response has nothing in the body because the content is loaded asynchronously, e.g. the results of a search on the Apple website: http://www.apple.com/uk/search/?q=searching+for+something&sec=global

Any ideas on how I can successfully grab the search results with Hpricot?

Thanks.


1 Answer

Editorial Team · Answer 1 · 2026-05-15T12:30:57+00:00

When the search page you refer to is loaded, it makes a request via JavaScript (Ajax) to another location and then populates the search results. Those are the results you see in the page. Hpricot by itself can’t help you here because it has no way to run the JavaScript that comes with the page in order to fetch the actual search results list.

Now, if what you’re interested in are the search results, you need to look at what happens when you open that page and type a search query. Some JavaScript in the page takes your query and calls (via XMLHttpRequest or a similar Ajax technique) another script on Apple’s server. That script is the one that actually runs the search against a database and returns the results.

I suggest you install Firefox with the Firebug plugin, or use some other tool that shows the actual requests a page and its JavaScript components send and receive. You’ll see that the search page you referred to fetches two parts. First, the “featured” results, which come from this URL:

http://www.apple.com/global/scripts/search_featured.php?q=mac+mini&section=global&geo=uk

Notice the search string is in the “q” parameter.
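As a minimal sketch of how that URL could be assembled for an arbitrary query, the snippet below uses Ruby's standard `uri` library. The helper name `featured_search_url` is hypothetical; the endpoint and the parameter names (`q`, `section`, `geo`) are simply copied from the URL above, and there is no guarantee that the endpoint still responds.

```ruby
require "uri"

# Endpoint copied from the URL above; it may no longer exist.
FEATURED_ENDPOINT = "http://www.apple.com/global/scripts/search_featured.php"

# Hypothetical helper: builds the featured-results URL for a search string.
# The parameter names q, section and geo come from the URL in the answer.
def featured_search_url(query, section: "global", geo: "uk")
  uri = URI.parse(FEATURED_ENDPOINT)
  # encode_www_form escapes the values and encodes spaces as "+"
  uri.query = URI.encode_www_form(q: query, section: section, geo: geo)
  uri.to_s
end

puts featured_search_url("mac mini")
```

Letting `URI.encode_www_form` do the escaping avoids hand-building the query string, which matters once the search terms contain spaces or punctuation.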

Second, a long results list comes from here:

http://www.apple.com/search/service/nph-search10?site=uk_www&filter=1&snum=50&q=mac+mini

Both of these are XML documents; you may have better luck fetching these URLs directly and parsing them with Hpricot.
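The idea of fetching the XML directly and pulling fields out of it could be sketched roughly as follows. This is only an illustration under several assumptions: it uses REXML from Ruby's standard distribution as a stand-in for Hpricot, the element names `<result>`, `<title>` and `<url>` are placeholders because the answer does not show the real response format, and `fetch_xml` and `parse_results` are hypothetical helper names.

```ruby
require "net/http"
require "uri"
require "rexml/document"

# Hypothetical sample document: the real element names in Apple's response
# are not shown in the answer, so these are placeholders.
SAMPLE_XML = <<~XML
  <results>
    <result><title>First hit</title><url>http://example.com/1</url></result>
    <result><title>Second hit</title><url>http://example.com/2</url></result>
  </results>
XML

# Downloads the raw XML body from one of the service URLs (not called here,
# since the endpoints from the answer may no longer respond).
def fetch_xml(url)
  Net::HTTP.get(URI.parse(url))
end

# Returns an array of { title:, url: } hashes, one per <result> element.
def parse_results(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//result").map do |node|
    { title: node.elements["title"]&.text, url: node.elements["url"]&.text }
  end
end

parse_results(SAMPLE_XML).each { |r| puts "#{r[:title]} -> #{r[:url]}" }
```

With a live endpoint, `parse_results(fetch_xml(url))` would replace the sample string, after adjusting the XPath and element names to whatever the actual response contains. The same approach should carry over to Hpricot once the document is loaded with its XML parser.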
