I’d like to store some relatively simple data in XML in a cascading manner. The idea is that a build can have a number of parameter sets, and a Python script creates the necessary build artifacts (*.h etc.) by reading these sets. If two sets have the same parameter, the latter one replaces the former.
There are (at least) two differing ways of doing the XML:
First way:
<Variants>
<Variant name="foo" Info="foobar">1</Variant>
</Variants>
Second way:
<Variants>
<Variant>
<Name>Foo</Name>
<Value>1</Value>
<Info>foobar</Info>
</Variant>
</Variants>
Which one is easier to handle in ElementTree? My limited understanding says it would be the first one, as I could search for the variant with find() easily and receive the entire subtree, but would it be just as easy with the second style? My colleague says that the latter XML is better as it allows expanding the XML more easily (and he is obviously right), but I don’t see expandability as a major factor at the moment (it might very well be that we will never need it).
EDIT: I could of course use lxml as well; does it matter in this case? Speed really isn’t an issue, as the files are relatively small.
You’re both right, but I would pick #1 where possible, except for the text content:
#1 is much more succinct and human-readable, thus less error-prone.
#1 is still pretty extensible. You can always add more attributes or child elements. The only way it isn’t extensible is if you later discover you need multiple values for name or info (or the text content value), since you can’t have multiple attributes with the same name on an element (nor multiple text content nodes without something in between). However, you can still extend those by various techniques, e.g. space-separated values in an attribute, or adding child elements as an alternative to an attribute.
Update: further reading
A few good articles on the XML elements-vs-attributes debate, including when to use each:
See also this SO question (but I think the above give more profitable reading).
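On the ElementTree side of the question: both styles can be searched with a single find() call, since ElementTree's limited XPath support covers attribute predicates as well as child-text predicates. Here is a minimal sketch, assuming the two layouts from the question. The helper names (find_attr_style, find_elem_style, merge_variants) are made up for illustration, and the merge function is only one possible reading of the "latter replaces the former" rule, applied to the first style.

```python
import xml.etree.ElementTree as ET

# The two layouts from the question (first style with the closing tag fixed).
ATTR_STYLE = """
<Variants>
  <Variant name="foo" Info="foobar">1</Variant>
</Variants>
"""

ELEM_STYLE = """
<Variants>
  <Variant>
    <Name>Foo</Name>
    <Value>1</Value>
    <Info>foobar</Info>
  </Variant>
</Variants>
"""


def find_attr_style(root, name):
    """First style: match on the 'name' attribute of <Variant>."""
    return root.find(f"./Variant[@name='{name}']")


def find_elem_style(root, name):
    """Second style: match on the text of the <Name> child element."""
    return root.find(f"./Variant[Name='{name}']")


def merge_variants(*roots):
    """Hypothetical cascade for the first style: later sets override earlier ones.

    Returns a dict mapping variant name to its text value.
    """
    merged = {}
    for root in roots:
        for variant in root.findall("./Variant"):
            merged[variant.get("name")] = variant.text
    return merged


if __name__ == "__main__":
    attr_root = ET.fromstring(ATTR_STYLE.strip())
    elem_root = ET.fromstring(ELEM_STYLE.strip())

    v1 = find_attr_style(attr_root, "foo")
    print(v1.text, v1.get("Info"))

    v2 = find_elem_style(elem_root, "Foo")
    print(v2.findtext("Value"), v2.findtext("Info"))
```

Either way find() returns the whole Variant element, so the lookup itself is not much of a differentiator; the second style just needs findtext() calls where the first uses get() and .text. The same predicates should also work in lxml, so the choice of library does not change the picture for files this small.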