I am trying to extract text out of a html. doc = Nokogiri::HTML(‘ <A

Editorial Team

Asked: May 27, 2026 (2026-05-27T02:46:35+00:00)

I am trying to extract text from an HTML document.

doc = Nokogiri::HTML('<B> <A href="http://www.asl.com/foo/bar"> Status :</A></B> REGISTERED ')

puts doc.search('//b').first.text
puts doc.search('//b[contains(text(),"Status")]/following-sibling::text()[1]').first.text

The first puts prints Status :
But the second puts throws an exception: undefined method 'text' for nil:NilClass

Why doesn't contains match here?
Or am I doing something wrong?


1 Answer

Editorial Team · Answer 1 · 2026-05-27T02:46:35+00:00

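I think you have the wrong idea of text() in XPath. Unlike the DOM text property, it does not return a concatenated string of all descendant text nodes. Instead it selects individual text nodes that are direct children of the context node. In your markup, "Status :" is a child of the a element, not of b, so contains(text(),"Status") finds nothing when it is evaluated on b.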

In your example //text() would select three text nodes:

 [" ", " Status :", " REGISTERED "]

What you might want is this XPath expression:

//b/a[contains(text(),"Status")]/../following-sibling::text()[1]

Essentially, it finds the a element that has the matching text node, then walks up to the parent element (b) and selects that element's first following sibling text node.
