I have a filesystem that is represented in an XML document in the following format:
<xml xmlns="namespace1" xmlns:ns2="namespace2">
<entry>
<id>123</id>
<ns2:content name="type">directory</ns2:content>
<ns2:content name="numErrors">3</ns2:content>
</entry>
...
<entry>
<id>456</id>
<ns2:content name="type">file</ns2:content>
<ns2:content name="docState">success</ns2:content>
</entry>
...
</xml>
What I need to do is, using Python’s lxml, retrieve only the entry objects that represent directories. All entries contain a <ns2:content name="type"> object, and I need to know how to retrieve a list of the entry objects where that object’s text is equal to directory. I can do this in several inconvenient steps, but I would rather have one query for it. Here is the way I would do it in steps:
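# xml_parse.py
ns = {'ns1': 'namespace1', 'ns2': 'namespace2'}
# tree is the already-parsed document
for node in tree.xpath("//ns1:entry", namespaces=ns):
    # find() needs the same prefix map, otherwise the ns2 prefix cannot be resolved
    if node.find("ns2:content[@name='type']", namespaces=ns).text == "directory":
        # do stuff with node
        pass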
Can anyone explain how to do this within the for statement instead of using an if?
Thanks
Use the following XPath expression:
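One expression that fits the question's data moves the type test into a predicate on entry, so the for loop only ever sees directories. This is a minimal sketch, assuming the ns1 and ns2 prefixes are bound as in the asker's ns dictionary. The sample document is a trimmed copy of the question's XML with the elided entries left out, and directory_ids is a name introduced here only for illustration.

```python
from lxml import etree

# Trimmed copy of the question's document (elided entries omitted).
XML = b"""<xml xmlns="namespace1" xmlns:ns2="namespace2">
  <entry>
    <id>123</id>
    <ns2:content name="type">directory</ns2:content>
    <ns2:content name="numErrors">3</ns2:content>
  </entry>
  <entry>
    <id>456</id>
    <ns2:content name="type">file</ns2:content>
    <ns2:content name="docState">success</ns2:content>
  </entry>
</xml>"""

ns = {'ns1': 'namespace1', 'ns2': 'namespace2'}
tree = etree.fromstring(XML)

# The outer predicate keeps an entry only if it has an ns2:content child
# whose name attribute is 'type' and whose text is 'directory'.
query = "//ns1:entry[ns2:content[@name='type'] = 'directory']"

directories = tree.xpath(query, namespaces=ns)

# Collect the ids of the matching entries (illustrative only).
directory_ids = [node.findtext("ns1:id", namespaces=ns) for node in directories]

for node in directories:
    # do stuff with node; every node here is a directory entry
    pass
```

With this, the if statement from the question is no longer needed, because the filtering happens inside the XPath query itself.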