I am trying to extract data from an XML feed in a Rails application with the Nokogiri gem.
The XML:
<item>
<description>
<![CDATA[<img src="something" title="anothething">
<p>text, bla bla...</p>]]>
</description>
</item>
Currently I do something like this to extract data from the XML:
def test_content
  @return = Array.new
  site = 'http://www.les-encens.com/modules/feeder/rss.php?id_category=0'
  @doc = Nokogiri::XML(open(site, "UserAgent" => "Ruby-OpenURI"))
  @doc.xpath("//item").each do |n|
    @return << [
      n.xpath('description')
    ]
  end
end
Could you show me how to extract just the src attribute from the img tag?
Edit:
I have replaced the XML with the correct one.
The result of an xpath call made in Nokogiri is going to be a NodeSet, which is simply a list of Nokogiri Nodes.
With this in mind we can just pull examples from the Nokogiri Documentation and adapt them.
To answer your question, “Could you show me how to extract just the src attribute from the img tag?”, here is one such way.
As Phrogz notes below, a more idiomatic way of pulling the ‘src’ attribute from all of the image nodes is to map the ‘src’ attribute directly rather than iterating and pushing onto an array.