I’m trying to parse an OpenOffice spreadsheet to obtain rows with unique values in the first column.
That is, from the following XML fragment I would like to retrieve every <table:table-row> element whose first child <table:table-cell> holds a <text:p> value that has not already appeared in an earlier row.
<table:table table:name="foo">
  <table:table-row>
    <table:table-cell>
      <text:p>1</text:p>
    </table:table-cell>
    <table:table-cell>
      <text:p>foo</text:p>
    </table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell>
      <text:p>2</text:p>
    </table:table-cell>
    <table:table-cell>
      <text:p>bar</text:p>
    </table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell>
      <text:p>1</text:p>
    </table:table-cell>
    <table:table-cell>
      <text:p>baz</text:p>
    </table:table-cell>
  </table:table-row>
</table:table>
I’d like to get the following output as nodes:
<table:table-row>
  <table:table-cell>
    <text:p>1</text:p>
  </table:table-cell>
  <table:table-cell>
    <text:p>foo</text:p>
  </table:table-cell>
</table:table-row>
<table:table-row>
  <table:table-cell>
    <text:p>2</text:p>
  </table:table-cell>
  <table:table-cell>
    <text:p>bar</text:p>
  </table:table-cell>
</table:table-row>
How can I do this with XPath?
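This XPath produces the desired output. It keeps a row only if the text of its first cell does not equal the first-cell text of any preceding sibling row, so the first occurrence of each value is retained: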
/table:table/table:table-row[not(./table:table-cell[1]/text:p/text() = preceding-sibling::table:table-row/table:table-cell[1]/text:p/text())]
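To check what the predicate selects, here is a minimal Python sketch that applies the same rule procedurally. It does not evaluate the XPath itself: the standard library's xml.etree.ElementTree supports only a subset of XPath and has no preceding-sibling axis, so the loop below stands in for it. The namespace declarations are the usual OpenDocument URIs, added because the fragment in the question omits them. The function name first_occurrence_rows is a hypothetical helper, and the sketch assumes <table:table> is the root element, as it is in the fragment.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespace URIs (the question's fragment leaves them undeclared).
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

SAMPLE = """\
<table:table xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
             xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
             table:name="foo">
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>foo</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>2</text:p></table:table-cell>
    <table:table-cell><text:p>bar</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>baz</text:p></table:table-cell>
  </table:table-row>
</table:table>
"""


def first_occurrence_rows(table):
    """Return the rows whose first-cell text did not occur in an earlier row.

    Hypothetical helper mirroring the XPath predicate: a row is dropped only
    when its first-cell text equals that of some preceding sibling row.
    """
    seen = set()
    kept = []
    for row in table.findall("table:table-row", NS):
        first_cell = row.find("table:table-cell", NS)  # first matching child
        key = None
        if first_cell is not None:
            # Treat a missing or empty <text:p> as "no value to compare".
            key = first_cell.findtext("text:p", namespaces=NS) or None
        if key is None or key not in seen:
            kept.append(row)
        if key is not None:
            seen.add(key)
    return kept


if __name__ == "__main__":
    table = ET.fromstring(SAMPLE)  # root element is <table:table>
    for row in first_occurrence_rows(table):
        cells = row.findall("table:table-cell", NS)
        print([cell.findtext("text:p", namespaces=NS) for cell in cells])
```

Run against the sample, this keeps the 1/foo and 2/bar rows and drops 1/baz, matching the output asked for in the question. In a real content.xml the table sits deeper in the document, so the leading /table:table step of the XPath would probably need to become //table:table or a full path, and the table: and text: prefixes must be bound to these URIs in whatever XPath engine is used.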