I am brand new to Ruby and XPath. I need to extract the System features from the table at
So far I have tried targeting all of the td tags. The page doesn't use CSS ids, so I can't target the cell that way.
I tried the following code:
doc.xpath('//tr/th/span[normalize-space(text())="System features"]/..')
but it returns nothing ;(
Does anyone have any idea of the best way to approach this?
That expression should work fine on the given source, but it’s not really idiomatic. You probably want to use something more like this:
Two things to note about your expression:
- normalize-space expects a string argument. Passing the node-set returned by text() forces conversion to a string by taking the first text node in document order. This doesn't really matter in your document, because there's only one child text node, but you should be aware that this is what's happening.
- There's no need for /.. at the end of the expression. You can test for the presence of the child span using a nested predicate and thereby select the desired th directly.
If you want to exploit the fact that the target th contains only the one child span node, you could simplify the expression further by applying normalize-space to the th itself rather than to the span.
So, why isn't it working? Hard to tell, but it could be because the tool you're using to parse the document is creating a structure that differs from how it appears in the literal source (e.g. because the input isn't really well-formed XML). Try a slightly different expression:
Or maybe you should first verify that you can retrieve the span itself, then build the expression up from that: