I need to use XPath to select distinct elements based on multiple child elements.
I have some XML like this:
<Cars>
  <Car>
    <Make>SuperCar</Make>
    <Year>2009</Year>
  </Car>
  <Car>
    <Make>SuperCar</Make>
    <Year>2010</Year>
  </Car>
  <Car>
    <Make>AwesomeCar</Make>
    <Year>2010</Year>
  </Car>
  <Car>
    <Make>SuperCar</Make>
    <Year>2009</Year>
  </Car>
</Cars>
And I need to use XPath (I’m limited to XPath 1.0) to select only distinct elements, where distinct is defined by both Make and Year.
In the example above this would mean returning all of the Car elements except the last one (as it is a duplicate of the first Car).
In a typical situation (i.e. where objects were identified by a single key field) I would use an XPath similar to this:
/*/Car[not(./Id/text()=following-sibling::Car/Id/text())]
However, I can’t quite work out how to adapt this to correctly use multiple fields.
I considered something like this:
/*/Car[not(./Make/text()=following-sibling::Car/Make/text()
and ./Year/text()=following-sibling::Car/Year/text())]
But that checks whether the Make appears on any following Car and whether the Year appears on any following Car, without ensuring that both matches occur on the same Car. For example, it incorrectly excludes the first Car from the following sample XML:
<Cars>
  <Car>
    <Make>SuperCar</Make>
    <Year>2009</Year>
  </Car>
  <Car>
    <Make>SuperCar</Make>
    <Year>2010</Year>
  </Car>
  <Car>
    <Make>AwesomeCar</Make>
    <Year>2009</Year>
  </Car>
</Cars>
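To make the failure concrete, here is a rough Python sketch that mimics what the two-comparison predicate does, using Year as the second key to match the sample XML. It is only an emulation with the standard library's xml.etree.ElementTree, not an XPath engine, and the function name naive_filter is made up for illustration. Each comparison is evaluated independently against all following siblings, which is how node-set equality behaves in XPath 1.0.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Cars>
  <Car><Make>SuperCar</Make><Year>2009</Year></Car>
  <Car><Make>SuperCar</Make><Year>2010</Year></Car>
  <Car><Make>AwesomeCar</Make><Year>2009</Year></Car>
</Cars>"""


def naive_filter(root):
    """Emulate not(Make = following Makes and Year = following Years).

    Hypothetical helper: the two checks are independent, so the Make match
    and the Year match may come from two different following Car elements.
    """
    cars = root.findall("Car")
    kept = []
    for i, car in enumerate(cars):
        later = cars[i + 1:]
        make_seen = any(c.findtext("Make") == car.findtext("Make") for c in later)
        year_seen = any(c.findtext("Year") == car.findtext("Year") for c in later)
        if not (make_seen and year_seen):
            kept.append(car)
    return kept


root = ET.fromstring(SAMPLE)
naive_kept = [(c.findtext("Make"), c.findtext("Year")) for c in naive_filter(root)]
print(naive_kept)  # [('SuperCar', '2010'), ('AwesomeCar', '2009')]
```

The first Car (SuperCar, 2009) is dropped even though no other Car has that combination: the Make matches the second Car and the Year matches the third.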
Any ideas?
It would seem this is impossible using XPath 1.0 alone; see this related question:
How to select distinct values from XML document using XPATH?
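If a single XPath 1.0 expression really cannot do it, one possible workaround is to select every Car with a plain path and remove duplicates in the calling code. The sketch below assumes Python and the standard library's xml.etree.ElementTree are available, which the question does not state; distinct_cars is a hypothetical helper name. It keeps the first occurrence of each (Make, Year) pair.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Cars>
  <Car><Make>SuperCar</Make><Year>2009</Year></Car>
  <Car><Make>SuperCar</Make><Year>2010</Year></Car>
  <Car><Make>AwesomeCar</Make><Year>2010</Year></Car>
  <Car><Make>SuperCar</Make><Year>2009</Year></Car>
</Cars>"""


def distinct_cars(root):
    """Return Car elements that are unique by (Make, Year), first one wins."""
    seen = set()
    result = []
    for car in root.findall("Car"):
        # Build a composite key so both fields must match on the same Car.
        key = (car.findtext("Make"), car.findtext("Year"))
        if key not in seen:
            seen.add(key)
            result.append(car)
    return result


root = ET.fromstring(SAMPLE)
distinct = [(c.findtext("Make"), c.findtext("Year")) for c in distinct_cars(root)]
print(distinct)
# [('SuperCar', '2009'), ('SuperCar', '2010'), ('AwesomeCar', '2010')]
```

Using a tuple as the key avoids the ambiguity of concatenating the two strings, and document order is preserved because the elements are visited in order.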