I am trying to get the ASIN from an Amazon HTML page using Nokogiri

Asked: May 21, 2026

I am trying to get the ASIN from an Amazon HTML page using Nokogiri, but I am having no luck with XPath. I have also tested the expression with FirePath and still get nothing. Would it be better to just grab the URL and run a Ruby regex over it to pull the ASIN out? If so, what would the regex look like?

#!/usr/bin/env ruby -w
require 'nokogiri'
require 'open-uri'
url = "http://www.amazon.com/gp/new-releases/books/3839/ref=zg_bsnr_nav"
doc = Nokogiri::HTML(open(url))

puts doc.xpath('//zg_list').each do | node|
  p node['asin']
end

This is what I have so far; it prints out the URL for each item:

#!/usr/bin/env ruby -w
require 'nokogiri'
require 'open-uri'
url = "http://www.amazon.com/gp/new-releases/books/3839/ref=zg_bsnr_nav"
doc = Nokogiri::HTML(open(url))

l = doc.css('div.zg_image a').map { |link| link['href'] }
puts l # => /Introducing-ZBrush-4-Eric-Keller/dp/0470527641/ref=zg_bsnr_3839_20/183-0702383-0095048

1 Answer

Editorial Team · Answer 1 · 2026-05-21T06:59:38+00:00

For me, the css method in Nokogiri is much easier to work with than XPath. Given the HTML at the URL you posted, the following should retrieve the "asin" attribute for each item:

doc.css("div.zg_item").map { |e| e["asin"] }

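I think the correct XPath would be something like the following. Note that it returns the matching nodes, so you still need to read the "asin" attribute from each one: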

doc.xpath("//div[contains(@class, 'zg_item') and @asin]")
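
As for the regex route raised in the question, here is a rough sketch that works on the href you are already printing. It assumes the ASIN is the ten-character alphanumeric segment directly after "/dp/", as in your sample output; extract_asin is a hypothetical helper name, not part of Nokogiri.

```ruby
# href copied from the output shown in the question.
href = "/Introducing-ZBrush-4-Eric-Keller/dp/0470527641/ref=zg_bsnr_3839_20/183-0702383-0095048"

# Hypothetical helper. Assumption: the ASIN is the 10-character
# alphanumeric segment immediately following "/dp/" in the path.
def extract_asin(href)
  # String#[] with a regex and a group index returns the captured
  # group, or nil when the pattern does not match.
  href[%r{/dp/([A-Z0-9]{10})(?:[/?]|\z)}, 1]
end

puts extract_asin(href)
```

This is more fragile than reading the attribute, since it depends on the URL layout, but it avoids relying on the page's markup if the href is all you have.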
