I am new to Python. I need the text of the “title” and “pubDate” elements, but only from the first instance of each. I’ve been trying lxml:
from lxml import etree

tree = etree.parse('doc.xml')
x = tree.findtext("rss/channel/item/title")
y = tree.findtext("rss/channel/item/pubDate")
print(x, y)
I keep getting None, None in output.
Here is the XML file:
<rss version="2.0">
<channel>
<title>Dynamic rss from aaaa.aaaa search</title>
<link>http://aaaaa.aaaa.info</link>
<ttl>30</ttl>
<description>RSS feed for selected show/news</description>
<item>
<title>
<![CDATA[ AAAAAAA 7x16 (HDTV-LOL) [VTV] ]]>
</title>
<pubDate>Mon, 13 Feb 2012 00:00:00 GMT</pubDate>
<link>
<![CDATA[
http://torrent.zoink.it/AAAAAAAA.7x16.(HDTV-LOL)[VTV].torrent
]]>
</link>
<description>
<![CDATA[
AAAAAAAA 7x16 (HDTV-LOL) [VTV] - http://torrent.zoink.it/AAAAAAA.7x16.(HDTV-LOL[VTV].torrent
]]>
</description>
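findtext takes a path relative to the root element, which is already rss, so "rss/channel/item/title" matches nothing and returns None. Either drop the leading "rss/" from the path, or query with an absolute XPath using the xpath method and take the first match with [0]: the xpath method returns lists of elements.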
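A minimal sketch of both approaches, assuming lxml is installed. The inline XML is a trimmed stand-in for doc.xml, and the second item is made up purely to show that only the first match is read; the variable names are illustrative.

```python
from lxml import etree

# Trimmed stand-in for doc.xml; the second <item> is invented for illustration.
XML = b"""<rss version="2.0">
<channel>
<title>Dynamic rss from aaaa.aaaa search</title>
<item>
<title>
<![CDATA[ AAAAAAA 7x16 (HDTV-LOL) [VTV] ]]>
</title>
<pubDate>Mon, 13 Feb 2012 00:00:00 GMT</pubDate>
</item>
<item>
<title>second item</title>
<pubDate>Tue, 14 Feb 2012 00:00:00 GMT</pubDate>
</item>
</channel>
</rss>"""

# For a file on disk this would be etree.parse('doc.xml') instead.
tree = etree.fromstring(XML).getroottree()

# xpath() returns a list of matching elements; [0] picks the first one.
# strip() removes the newlines and spaces around the CDATA content.
title = tree.xpath("/rss/channel/item/title")[0].text.strip()
pub_date = tree.xpath("/rss/channel/item/pubDate")[0].text.strip()

# findtext() also works once the path is relative to the root <rss> element,
# and it returns the text of the first match only.
title_via_findtext = tree.findtext("channel/item/title").strip()

print(title, pub_date)
```

If the feed might contain no items at all, check that the list returned by xpath is non-empty before indexing it, since [0] on an empty list raises IndexError.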