I have an XML file that looks like this:
xml = '''<?xml version="1.0"?>
<root>
<item>text</item>
<item2>more text</item2>
<targetroot>
<targetcontainer>
<target>text i want to get</target>
</targetcontainer>
<targetcontainer>
<target>text i want to get</target>
</targetcontainer>
</targetroot>
...more items
</root>
'''
With lxml I’m trying to access the text in the <target> elements. I’ve found a solution, but I’m sure there is a better, more efficient way to do this. My solution:
from lxml import etree

root = etree.XML(xml)
for x in root.iter('root'):
    item1 = x.findtext('item')
    for targetroot in x.iterchildren('targetroot'):
        for t in targetroot.iterchildren('targetcontainer'):
            targetText = t.findtext('target')
Although this works, since it gives me access to all the elements in root as well as the target elements, I’m having a hard time believing this is the most efficient solution.
So my question is: is there a more efficient way to access the <target> texts while staying in the loop over root? I also need access to the other elements.
You can use XPath:
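Something along these lines should work. This is only a sketch: it assumes a trimmed copy of the document from the question, passed as bytes, and the names `targets` and `target_texts` are mine, chosen for illustration.

```python
from lxml import etree

# Trimmed copy of the XML from the question (the "...more items" line is
# left out). Passed as bytes so the XML declaration is parsed as-is.
xml = b'''<?xml version="1.0"?>
<root>
<item>text</item>
<item2>more text</item2>
<targetroot>
<targetcontainer>
<target>text i want to get</target>
</targetcontainer>
<targetcontainer>
<target>text i want to get</target>
</targetcontainer>
</targetroot>
</root>
'''

root = etree.XML(xml)

# xpath() returns a list of every element matching the expression.
targets = root.xpath('/root/targetroot/targetcontainer/target')

# Collect the text of each matched <target> element.
target_texts = [t.text for t in targets]
print(target_texts)
```

Because `root` is still an ordinary element, calls such as `root.findtext('item')` keep working alongside the XPath query.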
This asks for all elements that match a path. In this case the path is /root/targetroot/targetcontainer/target, which means every <target> element that is a child of a <targetcontainer>, inside <targetroot>, inside the top-level <root>.
Also, your XML document had two problems. First, the <?xml version="1.0"?> declaration should be the very first thing in the document, and in this example it was preceded by a newline and some spaces. Second, the declaration is not a tag and should not be closed, so the </xml> at the end of your string should be removed. I have already edited your question to fix both.
EDIT: this solution can be improved further. You do not need to pass the whole path; you can just ask for all <target> elements in the document by preceding the tag name with two slashes. Since you want all the <target> texts regardless of where they are, this can be a better solution: the loop above only needs the expression //target in place of the full path.
I tried this at first but it did not work. The problem, however, was the syntax errors in the XML, not the XPath; I then tried the other, longer path and forgot to retry this one. Sorry! Anyway, I hope this sheds some light on XPath nonetheless 🙂
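For reference, here is a rough sketch of the shorter //target form next to a scoped variant. It assumes a trimmed copy of the question's document; the names `all_targets`, `container` and `scoped` are hypothetical and only there for illustration.

```python
from lxml import etree

# Trimmed copy of the XML from the question, as bytes.
xml = b'''<?xml version="1.0"?>
<root>
<item>text</item>
<item2>more text</item2>
<targetroot>
<targetcontainer>
<target>text i want to get</target>
</targetcontainer>
<targetcontainer>
<target>text i want to get</target>
</targetcontainer>
</targetroot>
</root>
'''

root = etree.XML(xml)

# The other children of <root> are still reachable the usual way.
item1 = root.findtext('item')

# '//target' matches every <target> anywhere in the document.
all_targets = [t.text for t in root.xpath('//target')]

# './/target' (leading dot) only searches below the element it is called on.
container = root.find('targetroot')
scoped = [t.text for t in container.xpath('.//target')]

print(item1, all_targets, scoped)
```

The leading dot matters: without it, `//target` always searches the whole document, no matter which element `xpath()` is called on.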