I would like to get all the text nodes of a document, but only


Editorial Team

Asked: May 30, 2026 (2026-05-30T20:01:12+00:00)


I would like to get all the text nodes of a document, but only those that are NOT part of a hyperlink.

Test sample:

Hello <a class='foobar' href='foo.html'>foo</a>World Hello foo World

The result text nodes should include the text node with Hello foo World, but not the hyperlink.

I tried "//*[not(@href)]/text()", but this does not appear to work.

UPDATE

As my answer below (hopefully) explains, my problem was that the query was looking for nodes inside the root node, but not the root node itself.

Andrew came up with a different approach that probably makes the intent clearer.


1 Answer

Editorial Team · Answer 1 · 2026-05-30T20:01:14+00:00

You can also exclude parents (which I think is what you were thinking of earlier?), but you need to place the exclusion later, as a predicate on the text node itself (the shorthand notation doesn't seem to work in this context):

//text()[not(parent::a)]

for example:

> cat foo.xml 
<b>
<a href="href">baz</a>
text
<c>foo<a href="href">bar</a>here</c>
more
</b>

> xpath foo.xml "//text()[not(parent::a)]"
Found 5 nodes:
-- NODE --

-- NODE --

text
-- NODE --
foo-- NODE --
here-- NODE --

more
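
For anyone who wants the same filter without an XPath engine, here is a minimal sketch in Python using only the standard library. It is not part of the original answer: the function name text_nodes_outside_links and the SAMPLE constant are made up for illustration, and it assumes the ElementTree model, where an element's text lives in .text and the text following a child lives in that child's .tail. Like the predicate not(parent::a), it only checks the direct parent of each text node.

```python
import xml.etree.ElementTree as ET

# Same content as foo.xml above (hypothetical constant name).
SAMPLE = (
    '<b>\n'
    '<a href="href">baz</a>\n'
    'text\n'
    '<c>foo<a href="href">bar</a>here</c>\n'
    'more\n'
    '</b>'
)


def text_nodes_outside_links(element):
    """Yield every text node whose direct parent element is not <a>."""
    parent_is_link = element.tag == "a"
    # element.text is the text node sitting directly inside element,
    # before its first child.
    if element.text and not parent_is_link:
        yield element.text
    for child in element:
        # Recurse first so results come out in document order.
        yield from text_nodes_outside_links(child)
        # child.tail is the text right after child's closing tag,
        # so its parent is element, not child.
        if child.tail and not parent_is_link:
            yield child.tail


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for node in text_nodes_outside_links(root):
        print(repr(node))
```

Run against the sample, this should yield the same five text nodes as the xpath command above, including the whitespace-only node between <b> and the first <a>. Text inside an element nested in a link (for example <a><i>x</i></a>) would still be returned, exactly as with not(parent::a); excluding it would need an ancestor check instead.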
