Given the following XML-compliant HTML: <div> <a>a1</a> <b>b1</b> </div> <div> <b>b2</b> </div> <div> <a>a3</a>

Question

Asked: May 31, 2026 (2026-05-31T15:30:37+00:00)

Given the following XML-compliant HTML:

<div>
 <a>a1</a>
 <b>b1</b>
</div>

<div>
 <b>b2</b>
</div>

<div>
 <a>a3</a>
 <b>b3</b>
 <c>c3</c>
</div>

Evaluating //a returns:

[a1,a3]

The problem is that the third column's data is now in second place: when an a element is not found, it is skipped entirely.

How can I write an XPath expression for the a elements that returns:

[a1, null, a3]

The same applies to //c. I wonder if it's possible to get:

[null, null, c3]

UPDATE: Consider another scenario where there is no common parent <div>:

<h1>heading1</h1>
 <a>a1</a>
 <b>b1</b>


<h1>heading2</h1>
 <b>b2</b>


<h1>heading3</h1>
 <a>a3</a>
 <b>b3</b>
 <c>c3</c>

UPDATE: I am now able to use XSLT as well.

1 Answer

Editorial Team · Answer 1 · 2026-05-31T15:30:38+00:00

There is no null value in XPath. There’s a semi-related question here which also explains this: http://www.velocityreviews.com/forums/t686805-xpath-query-to-return-null-values.html

Realistically, you’ve got three options:

1. Don't use XPath at all.
2. Use this: //a | //div[not(a)], which returns the div element itself when there is no a within it, and have your Java code treat any div that is returned as "no a element present". Depending on the context, this may even let you output something more useful, because you have access to the entire contents of the div, for example an error such as "no a element found in div (some identifier)".
3. Preprocess your XML with an XSLT that inserts an a element with a suitable default into any div element that does not already have one.
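As a minimal sketch of the second option, the following uses the JDK's built-in javax.xml.xpath API. The class name DivPlaceholders, the method valuesOrNull, and the wrapping <root> element are assumptions added for illustration; they are not part of the question. It also relies on the JDK's XPath implementation returning the union in document order, which XPath 1.0 itself does not promise for node-sets.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DivPlaceholders {
    // The question's sample, wrapped in a hypothetical <root> so it parses as one document.
    static final String XML =
        "<root>"
        + "<div><a>a1</a><b>b1</b></div>"
        + "<div><b>b2</b></div>"
        + "<div><a>a3</a><b>b3</b><c>c3</c></div>"
        + "</root>";

    // Returns one entry per div: the text of the named child, or null when the div has none.
    static List<String> valuesOrNull(String xml, String name) {
        String expr = "//" + name + " | //div[not(" + name + ")]";
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node n = nodes.item(i);
                // A returned div is the placeholder for "no such child here".
                out.add("div".equals(n.getNodeName()) ? null : n.getTextContent());
            }
            return out;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(valuesOrNull(XML, "a"));
        System.out.println(valuesOrNull(XML, "c"));
    }
}
```

The null is produced by the calling code, not by XPath: the expression only guarantees that each div contributes exactly one node to the result.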

Your second case is a little tricky, and to be honest, I’d actually recommend not using XPath for it at all, but it can be done:

//a | //h1[not(following-sibling::a) or generate-id(.) != generate-id(following-sibling::a[1]/preceding-sibling::h1[1])]

This matches every a element, plus every h1 element that has no following a element before the next h1 element or the end of the document. As Dimitre pointed out though, this only works if you're using it from within XSLT, as generate-id is an XSLT function.
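For illustration, here is one way the generate-id expression might be used inside a stylesheet, run through the JDK's javax.xml.transform API. The stylesheet, the class name HeadingPlaceholdersXslt, the <root> wrapper, and the comma-separated text output are all assumptions made for this sketch.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class HeadingPlaceholdersXslt {
    // The question's second sample, wrapped in a hypothetical <root> element.
    static final String XML =
        "<root>"
        + "<h1>heading1</h1><a>a1</a><b>b1</b>"
        + "<h1>heading2</h1><b>b2</b>"
        + "<h1>heading3</h1><a>a3</a><b>b3</b><c>c3</c>"
        + "</root>";

    // Hypothetical stylesheet: prints each a value, or "null" for an h1 that has no a.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//a | //h1[not(following-sibling::a) or"
        + " generate-id(.) != generate-id(following-sibling::a[1]/preceding-sibling::h1[1])]'>"
        + "<xsl:choose>"
        + "<xsl:when test='self::h1'>null</xsl:when>"
        + "<xsl:otherwise><xsl:value-of select='.'/></xsl:otherwise>"
        + "</xsl:choose>"
        + "<xsl:if test='position() != last()'>,</xsl:if>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet and returns the text output.
    static String transform(String xml) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)))
                .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML));
    }
}
```

Inside xsl:for-each the selected nodes are processed in document order, so each heading contributes either its a value or the literal text null in the right position.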

If you're not using it from within XSLT, you can use this rather contrived formula:

//a | //h1[not(following-sibling::a) or count(. | preceding-sibling::h1) != count(following-sibling::a[1]/preceding-sibling::h1)]

It works by matching h1 elements where the count of itself and all preceding h1 elements is not the same as the count of all h1 elements preceding the next a. There may be a more efficient way of doing it in XPath, but if it’s going to get any more contrived than that, I’d definitely recommend not using XPath at all.
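The count-based formula can be tried from plain Java without XSLT. This is only a sketch: the class name HeadingPlaceholders, the method valuesOrNull, and the <root> wrapper are assumptions, and generalizing the expression from a to any element name is an extrapolation of the formula above. As before, it assumes the JDK's implementation returns the union in document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class HeadingPlaceholders {
    // The question's second sample, wrapped in a hypothetical <root> element.
    static final String XML =
        "<root>"
        + "<h1>heading1</h1><a>a1</a><b>b1</b>"
        + "<h1>heading2</h1><b>b2</b>"
        + "<h1>heading3</h1><a>a3</a><b>b3</b><c>c3</c>"
        + "</root>";

    // Returns one entry per h1 section: the named element's text, or null if the section lacks it.
    static List<String> valuesOrNull(String xml, String name) {
        String expr = "//" + name
            + " | //h1[not(following-sibling::" + name + ")"
            + " or count(. | preceding-sibling::h1)"
            + " != count(following-sibling::" + name + "[1]/preceding-sibling::h1)]";
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node n = nodes.item(i);
                // A returned h1 marks a section with no matching element.
                out.add("h1".equals(n.getNodeName()) ? null : n.getTextContent());
            }
            return out;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(valuesOrNull(XML, "a"));
        System.out.println(valuesOrNull(XML, "c"));
    }
}
```

For heading2 the count of itself plus preceding h1 elements is 2, while the next a (a3) has 3 preceding h1 elements, so heading2 is returned and becomes the null entry.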
