I’m trying to extract each a href link on an html page for evaluation


Asked: May 16, 2026


I’m trying to extract every a href link on an HTML page for evaluation, using Nokogiri and XPath. What I have so far only pulls out the link text. I’m not interested in the link text, just the URL each link points to.

Here’s what I have:

require 'nokogiri'
require 'open-uri'

doc = Nokogiri::HTML(URI.open("http://www.cnn.com"))
doc.xpath('//a').each do |node|
  puts node.text  # prints the link text, not the URL
end

Can anyone show me how to correct this so that I’m pulling the actual href instead of the link text?


1 Answer

Editorial Team · Answer 1 · 2026-05-16T01:49:13+00:00


Your XPath expression //a selects every a element, and calling text on an element returns its text content rather than its attributes. In XPath, @attrname selects an attribute. For example:

//a/@href

This selects the href attribute of every a element in the document.
