I have two XML files. The first XML has a bunch of nodes that should be present in the second XML as well. The second XML might have a few extra nodes too. I need a Java-based program that can automate this check, i.e. given two XML files, it should tell me whether all the nodes of the first file are present in the second.
I am looking at Java + XMLUnit. However, XMLUnit does not have an exact solution for this. Help please.
Thanks.
First things first. Let me go on record and say that XMLUnit is a gem. I loved it. If you are looking at unit testing XML values, attributes, structure etc., chances are that you will find a ready-made solution in XMLUnit. It is a good place to start.
It is quite extensible. It already comes with an identity check (the XMLs have the same elements and attributes in the same order) and a similarity check (the XMLs have the same elements and attributes regardless of order).
However, in my case I was looking for a slightly different usage. I had a big-ish XML (a few hundred nodes), and a bunch of XML files (around 350,000 of them). I needed to exclude certain nodes from the comparison, nodes that I could identify with XPath. They were not always in the same position in the XML, but there was a generic way of identifying them with XPath. Sometimes nodes had to be ignored based on the values of other nodes. To give some idea:
The logic here is on the node that I want to ignore, i.e. price.
/bookstore/book[price>35]/price
The logic here is on a node at a relative position. I want to ignore author based on the value of price, and the two are related by position.
/bookstore/book[price=30]/./author
After much tinkering around, I settled for a low-tech solution. Before using XMLUnit to compare the files, I used XPath to mask the values of the nodes that were to be ignored.
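To make the masking step concrete, here is a minimal sketch using only the XPath and DOM classes that ship with the JDK. It is not my original code: the class name XPathMasker, the method names, the placeholder text and the sample bookstore XML are all made up for illustration. The idea is simply to overwrite the text of every node matched by an "ignore" expression with the same placeholder in both documents, so that those nodes can no longer show up as differences.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, named for illustration only.
final class XPathMasker {

    // Parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Replace the text content of every node matched by the XPath
    // expression with the placeholder. Returns the number of nodes masked.
    static int mask(Document doc, String xpathExpression, String placeholder)
            throws Exception {
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpathExpression, doc, XPathConstants.NODESET);
        for (int i = 0; i < hits.getLength(); i++) {
            hits.item(i).setTextContent(placeholder);
        }
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Sample data invented for this sketch.
        String xml = "<bookstore>"
                + "<book><author>A</author><price>30</price></book>"
                + "<book><author>B</author><price>40</price></book>"
                + "</bookstore>";
        Document doc = parse(xml);

        // The two "ignore" expressions from above.
        int prices = mask(doc, "/bookstore/book[price>35]/price", "MASKED");
        int authors = mask(doc, "/bookstore/book[price=30]/./author", "MASKED");

        System.out.println("Masked prices: " + prices + ", masked authors: " + authors);
    }
}
```

You would run the same mask calls on both the control and the test document and only then hand them to the XMLUnit comparison you already use. One thing to watch: if an expression depends on a value that an earlier expression has already masked, the order in which you apply them matters.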
Hope this helps.