So I have to write a “duplicate checker” that compares two XML documents and decides whether they are the same (contain the same data). Because both come from the same class and are generated from an XSD, the structure and the order of the elements inside will most likely be the same.
The best way I can think of to do the duplicate check is to set up two dictionaries (dictLeft, dictRight), using xpath#value as the key and the number of times that pair occurs as the value. Something like this:
Left:
{ 'my/path/to/name#greg': 1, 'my/path/to/name#john': 2, 'my/path/to/car#toyota': 1}
Right:
{ 'my/path/to/name#greg': 1, 'my/path/to/name#bill': 1, 'my/path/to/car#toyota': 1}
Comparing these two dictionaries will give me a fairly accurate indication of whether the two XMLs are the same (there is an odd chance of a false result, but it is very remote).
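For illustration, here is a minimal sketch of that dictionary idea using ElementTree and collections.Counter. The function names (path_value_counts, same_data) and the sample documents are made up for this example; the samples are shaped so that they produce the Left and Right dictionaries shown above. It assumes attributes are ignored, whitespace around text is not significant, and elements without text carry no data.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def path_value_counts(xml_text):
    """Count every 'path#text' pair in the document, ignoring attributes."""
    counts = Counter()

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}" if prefix else elem.tag
        text = (elem.text or "").strip()
        if text:  # only count elements that actually carry a value
            counts[f"{path}#{text}"] += 1
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


def same_data(left_xml, right_xml):
    """True when both documents yield the same path#value counts."""
    return path_value_counts(left_xml) == path_value_counts(right_xml)


# Hypothetical sample documents matching the Left/Right dictionaries above.
left = ("<my><path><to><name>greg</name><name>john</name>"
        "<name>john</name><car>toyota</car></to></path></my>")
right = ("<my><path><to><name>greg</name><name>bill</name>"
         "<car>toyota</car></to></path></my>")

print(same_data(left, right))  # False
# Entries that occur more often on the left than on the right:
print(path_value_counts(left) - path_value_counts(right))
# Counter({'my/path/to/name#john': 2})
```

Because the keys only record the path and the text, sibling order does not matter, which is also where the remote chance of a false "same" comes from: values that swap between two different parents with the same path would go unnoticed.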
Does anyone else have a better idea? Maybe a function in ElementTree that I do not know about?
EDIT: To better explain:
<root><person><name>Bob</name><surname>marley</surname></person></root>
and
<root><person><surname>marley</surname><name>Bob</name></person></root>
would be considered the same. I am ignoring attributes. The idea is to keep the code as simple as possible while not hampering performance too much.
OK, so I had to make a decision and went with this:
I hope this makes sense. This allows me to test for specific/all xpaths. If someone has a better algorithm, I’m all ears 🙂
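For readers who want something concrete to experiment with, below is a rough sketch of what testing only specific paths could look like. This is not the code referred to above, just one possible shape of it: the helper names (values_at, same_at_paths) are invented, the paths are ElementTree paths relative to the root element, and the sample documents are the ones from the EDIT with the person element closed. It only covers the "specific xpaths" case.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def values_at(root, path):
    """Multiset of stripped text values of all elements matching `path`."""
    return Counter((e.text or "").strip() for e in root.findall(path))


def same_at_paths(left_xml, right_xml, paths):
    """True when both documents hold the same values at every given path."""
    left = ET.fromstring(left_xml)
    right = ET.fromstring(right_xml)
    return all(values_at(left, p) == values_at(right, p) for p in paths)


a = "<root><person><name>Bob</name><surname>marley</surname></person></root>"
b = "<root><person><surname>marley</surname><name>Bob</name></person></root>"
c = "<root><person><name>Bob</name><surname>dylan</surname></person></root>"

print(same_at_paths(a, b, ["person/name", "person/surname"]))  # True
print(same_at_paths(a, c, ["person/name"]))                    # True
print(same_at_paths(a, c, ["person/surname"]))                 # False
```

Comparing Counters rather than lists keeps the check independent of element order while still noticing when a value appears a different number of times.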