Given the following XML-compliant HTML:
<div>
<a>a1</a>
<b>b1</b>
</div>
<div>
<b>b2</b>
</div>
<div>
<a>a3</a>
<b>b3</b>
<c>c3</c>
</div>
Evaluating //a returns:
[a1,a3]
The problem with the above is that the third column's data now sits in second place: when an <a> is not found, it is skipped entirely.
How can I write an XPath expression for the <a> elements that returns:
[a1, null, a3]
The same goes for //c; I wonder whether it is possible to get:
[null, null, c3]
UPDATE: Consider another scenario where there is no common parent <div>:
<h1>heading1</h1>
<a>a1</a>
<b>b1</b>
<h1>heading2</h1>
<b>b2</b>
<h1>heading3</h1>
<a>a3</a>
<b>b3</b>
<c>c3</c>
UPDATE: I am now able to use XSLT as well.
There is no null value in XPath. There’s a semi-related question here which also explains this: http://www.velocityreviews.com/forums/t686805-xpath-query-to-return-null-values.html
Realistically, you’ve got three options:
- Use //a | //div[not(a)], which would return the <div> element if there was no <a> within it, and have your Java code handle any <div>s returned as ‘no <a> element present’. Depending on the context, this may even allow you to output something more useful if required, as you’ll have access to the entire contents of the <div>, for example an error ‘no <a> element found in div (some identifier)’.
- … <a> elements in any <div> element that does not already have one with a suitable default.
Your second case is a little tricky, and to be honest, I’d actually recommend not using XPath for it at all, but it can be done:
//a | //h1[not(following-sibling::a) or generate-id(.) != generate-id(following-sibling::a[1]/preceding-sibling::h1[1])]
This will match any <a> elements, or any <h1> elements where no following <a> element exists before the next <h1> element, or the end of the document. As Dimitre pointed out though, this only works if you’re using it from within XSLT, as generate-id is an XSLT function.
If you’re not using it from within XSLT, you can use this rather contrived formula:
//a | //h1[not(following-sibling::a) or count(. | preceding-sibling::h1) != count(following-sibling::a[1]/preceding-sibling::h1)]
It works by matching <h1> elements where the count of itself and all preceding <h1> elements is not the same as the count of all <h1> elements preceding the next <a>. There may be a more efficient way of doing it in XPath, but if it’s going to get any more contrived than that, I’d definitely recommend not using XPath at all.
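To make the ‘let your Java code handle it’ idea concrete, here is a minimal sketch using the JDK's built-in javax.xml.xpath. It is only an illustration under a few assumptions: the class name NullPaddedColumns and the helper column are made up for this example, the fragments from the question are wrapped in a hypothetical <root> element so that they parse as one XML document, and the code relies on the XPath implementation returning the union node-set in document order (the JDK's bundled implementation does so in practice, although XPath 1.0 node-sets are formally unordered).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
final class NullPaddedColumns {

    // Evaluates the XPath and turns every returned node named
    // placeholderTag (the stand-in for a missing element) into null.
    public static List<String> column(String xml, String xpath, String placeholderTag)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            out.add(n.getNodeName().equals(placeholderTag) ? null : n.getTextContent());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // First scenario: each row has a common <div> parent.
        // The <root> wrapper is an assumption added for well-formedness.
        String divs = "<root><div><a>a1</a><b>b1</b></div><div><b>b2</b></div>"
                + "<div><a>a3</a><b>b3</b><c>c3</c></div></root>";
        System.out.println(column(divs, "//a | //div[not(a)]", "div"));
        System.out.println(column(divs, "//c | //div[not(c)]", "div"));

        // Second scenario: rows are delimited by <h1> siblings only.
        String flat = "<root><h1>heading1</h1><a>a1</a><b>b1</b>"
                + "<h1>heading2</h1><b>b2</b>"
                + "<h1>heading3</h1><a>a3</a><b>b3</b><c>c3</c></root>";
        String xpath = "//a | //h1[not(following-sibling::a)"
                + " or count(. | preceding-sibling::h1)"
                + " != count(following-sibling::a[1]/preceding-sibling::h1)]";
        System.out.println(column(flat, xpath, "h1"));
    }
}
```

With the sample data from the question this prints [a1, null, a3], [null, null, c3] and [a1, null, a3]. The placeholder node (the <div> or <h1>) is still available inside the loop, so the code could just as well report which row was missing the element instead of emitting null.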