I am trying to extract text from an HTML fragment.
require 'nokogiri'

doc = Nokogiri::HTML('<B> <A href="http://www.asl.com/foo/bar"> Status :</A></B> REGISTERED <BR>')
puts doc.search('//b').first.text
puts doc.search('//b[contains(text(),"Status")]/following-sibling::text()[1]').first.text
The first puts prints Status :
But the second puts raises an exception: undefined method 'text' for nil:NilClass
Why doesn't contains match here?
Or am I doing something wrong?
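I think you have the wrong idea of the text() function in XPath. Unlike the DOM function, it does not return a concatenated string of all text sub-nodes. Instead, it selects individual text nodes.
In your example, //text() would select three text nodes: the whitespace inside the <b> before the <a>, " Status :" inside the <a>, and " REGISTERED " after the </b>. Your predicate only looks at text nodes that are direct children of <b>, and none of those contains "Status".
What you probably want is an XPath expression that starts from the <a> element instead.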
Essentially, it finds the <a> element that has the matching text node, then walks up to the parent element (<b>), and then gets its following sibling text node.