I’m trying to parse a webpage using open-uri + hpricot but it seems to


Asked: May 12, 2026


I’m trying to parse a webpage using open-uri + hpricot, but there seems to be a problem in the parsing process, as the gems don’t return what I want.

Specifically I want to get this div (whose id is ‘pasajes’) in this url:

http://www.despegar.com.ar

I wrote this code:

require 'nokogiri'
require 'hpricot'
require 'open-uri'

document = Hpricot(open('http://www.despegar.com.ar/')) # WITH HPRICOT
document2 = Nokogiri::HTML(open('http://www.despegar.com.ar/')) # WITH NOKOGIRI

pasajes = document.search("//div[@id='pasajes']")
pasajes2 = document2.xpath("//div[@id='pasajes']")

But it returns NOTHING! I’ve tried a lot of things in both hpricot and nokogiri:

I tried giving the absolute path to that div
I tried a CSS path with selectors
I tried the hpricot search shortcut (doc//"div#pasajes")
I tried almost every possible relative path to reach the ‘pasajes’ div

Finally I found a horrible solution: I used the watir library and, after opening a web browser, passed the HTML to hpricot. This way hpricot DOES recognize the ‘pasajes’ div. But I don’t want to open a web browser just for parsing purposes…

What am I doing wrong? Is open-uri not working properly? Is hpricot?


1 Answer

Editorial Team · Answer 1 · 2026-05-12T10:13:01+00:00


There’s no div with the id pasajes in the static HTML of the page. If you are running *nix, you can check this with:

curl http://www.despegar.com.ar/ | grep pasajes
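The same check can be sketched in Ruby with open-uri alone, which makes it easy to compare against what the parsers see. This is a minimal sketch, not a definitive test: the helper names fetch_static_html and static_html_has_id? are hypothetical, the regex is only a rough approximation of an id attribute lookup, and it assumes a Ruby version where open-uri provides URI.open. If the check comes back false, the div is presumably added by JavaScript after the page loads, which would explain why it only appears when a real browser is driven through watir.

```ruby
require 'open-uri'

# Hypothetical helper: fetch the HTML exactly as the server sends it.
# No JavaScript is executed, so this is what hpricot/nokogiri receive.
def fetch_static_html(url)
  URI.open(url, &:read)
end

# Hypothetical helper: rough check for an element carrying id="<id>".
# The lookbehind avoids matching attributes such as data-id="...".
def static_html_has_id?(html, id)
  pattern = /(?<![\w-])id\s*=\s*["']#{Regexp.escape(id)}["']/
  html.match?(pattern)
end

# Usage (needs network access):
#   html = fetch_static_html('http://www.despegar.com.ar/')
#   puts static_html_has_id?(html, 'pasajes')
```

If this prints false while the browser-rendered HTML contains the div, the parsers are working correctly and the element simply isn’t in the document they were given.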
