I am using Ruby to retrieve an XML document with the following format:
<project>
<users>
<person>
<name>LUIS</name>
</person>
<person>
<name>JOHN</name>
</person>
</users>
</project>
I want to know how to produce the following result, with the tags concatenated:
<project>
<users>
<person>
<name>LUIS JOHN</name>
</person>
</users>
</project>
Here is the code I am using:
file = File.new( "proyectos.xml" )
doc3 = Nokogiri::XML(file)
a=0
@participa = doc3.search("person")
@participa.each do |i|
@par = @participa.search("name").map { |node| node.children.text }
@par.each do |i|
puts @par[a]
puts '--'
a = a + 1
end
end
Rather than supply code, here’s how to fish:
To parse your XML into Nokogiri, which I recommend highly:
That gives you a
docvariable which is the DOM as a Nokogiri::XML::Document. From that you can search, either for matching nodes or a particular node.searchallows you to pass an XPath or CSS accessor to locate what you are looking for. I recommend CSS for most things because it is more readable, but XPath has some great tools to dig into the structure of your XML, so often I end up with both in my code.So,
doc.at('users')is the CSS accessor to find the firstusersnode.doc.search('person')will return all nodes matching thepersontag as aNodeSet, which is basically an array which you can enumerate or loop over.Nokogiri has a
textmethod for a node that lets you get the text content of that node, including all the carriage-returns between nodes that would normally be considered formatting in the XML as it flows down the document. When you have the text of the node, you can apply the normal Ruby string processing commands, such asstrip,squish,chomp, etc., to massage the text into a more usable format.Nokogiri also has a
children=method which lets you redefine the child nodes of a node. You can pass in a node you’ve created, a NodeSet, or even the text you want rendered into the XML at that point.In a quick experiment, I have code that does what you want in basically four lines. But, I want to see your work before I share what I wrote.
Finally,
puts doc.to_xmlwill let you easily see if your changes to the document were successful.Here’s how I’d do it:
The XML is parsed into a DOM now. Search for the
userstags, then locate the embeddednametags and extract the text from them. Join the results into a single space-delimited string. Then replace the children of theuserstag with the desired results:If you output the resulting XML you’ll get: