I would like to get all the text nodes of a document, but only those that are NOT part of a hyperlink.
Test sample:
Hello <a class='foobar' href='foo.html'>foo</a>World Hello foo World
The resulting text nodes should include the plain Hello foo World text, but not the text inside the hyperlink.
I tried "//*[not(@href)]/text()", but this does not appear to work.
UPDATE
As my answer below (hopefully) explains, my problem was that the query was looking at nodes inside the root node, but not at the root node itself.
Andrew came up with a different approach that is probably clearer as to intent.
You can also exclude parents (which I think is what you were thinking of earlier?), but you need to place the exclusion later, and the shorthand notation doesn’t seem to work in this context.
For example:
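A minimal sketch of that idea, under some assumptions: the expression `//text()[not(parent::a)]` is my guess at what "placing the exclusion later" could look like (the predicate sits on the text nodes and uses the full `parent::` axis rather than the `..` shorthand), the `<p>` wrapper around the sample is hypothetical, and the class and method names are made up for illustration. It uses only the XPath 1.0 support that ships with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NonLinkText {
    // Assumed expression (not taken from the original answer):
    // select every text node, then drop those whose parent is an <a> element.
    static final String EXPR = "//text()[not(parent::a)]";

    // Returns the string value of each text node that is not a direct child of <a>.
    static List<String> textOutsideLinks(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(EXPR, doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical wrapper element around the sample from the question.
        String xml = "<p>Hello <a class='foobar' href='foo.html'>foo</a>World</p>";
        for (String text : textOutsideLinks(xml)) {
            System.out.println("[" + text + "]");
        }
    }
}
```

Note that `parent::a` only excludes text sitting directly inside the link. If the link can contain nested markup (say `<a><b>foo</b></a>`), `not(ancestor::a)` would be the safer predicate.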