I’m working on this HTML snippet:
<p class="pageSelector">
<a href="/BlaBla">< Prev</a>
<a href="/BlaBla">1</a>
<a href="/BlaBla">2</a>
<a href="/BlaBla">3</a>
4
<a href="/BlaBla">5</a>
<a href="/BlaBla">6</a>
<a href="/BlaBla">Next ></a>
</p>
rendered (more or less) as < Prev 1 2 3 4 5 6 Next > .
I want to select the “4” because I need to discover the ‘current’ page. Using
//p[@class='pageSelector']/text()[normalize-space()]
(tested with Firefox XPath Checker) I thought I’d solved it, but no: I get 7 matches.
Could anyone please tell me where I’m going wrong?
Thank you
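normalize-space() strips leading and trailing whitespace, but the no-break space character (U+00A0), despite its visual appearance, is not considered to be whitespace for this purpose. So I would remove it with translate() first, along the lines of
//p[@class='pageSelector']/text()[normalize-space(translate(., '&#xA0;', ''))]
(here &#xA0; stands for the no-break space itself; the character reference only works when the expression is embedded in XML, otherwise put the literal character between the quotes), which will return those child text nodes that contain a character other than whitespace or no-break space; you may then need to process this further to extract the part of the content you want.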
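To see the effect outside an XPath tool, here is a minimal Python sketch that emulates the same filter with the standard library rather than running XPath itself. The sample markup is hypothetical: the no-break spaces (`&#160;`) between the links are an assumption about what the real page contains, arranged so that the paragraph has seven child text nodes, and the helper names `child_text_nodes` and `current_page` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the &#160; (no-break space) separators are an assumption
# about the real page; they are not visible in the snippet from the question.
SAMPLE = (
    '<p class="pageSelector">'
    '<a href="/BlaBla">&lt; Prev</a>&#160;'
    '<a href="/BlaBla">1</a>&#160;'
    '<a href="/BlaBla">2</a>&#160;'
    '<a href="/BlaBla">3</a>&#160;4&#160;'
    '<a href="/BlaBla">5</a>&#160;'
    '<a href="/BlaBla">6</a>&#160;'
    '<a href="/BlaBla">Next &gt;</a>&#160;'
    '</p>'
)

# XPath whitespace (space, tab, CR, LF) plus the no-break space U+00A0.
BLANK = " \t\r\n\u00a0"


def child_text_nodes(element):
    """Return the direct child text nodes of element, in document order."""
    # ElementTree stores text nodes as .text on the parent and .tail on children.
    nodes = [element.text] + [child.tail for child in element]
    return [node for node in nodes if node is not None]


def current_page(element):
    """Keep only text nodes with a character other than BLANK, stripped."""
    stripped = (node.strip(BLANK) for node in child_text_nodes(element))
    return [text for text in stripped if text]


paragraph = ET.fromstring(SAMPLE)
print(len(child_text_nodes(paragraph)))  # 7 text nodes in this sample
print(current_page(paragraph))           # ['4']
```

Stripping with `BLANK` also covers the "process this further" step: the surviving node is `\u00a04\u00a0` in this sample, and the strip reduces it to the page number.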