I want to SAX-parse with Nokogiri, but when it comes to parsing an XML element that has a long, crazy element name or an attribute on it, everything goes crazy.
For instance, if I want to parse this XML file and grab all the title elements, how do I do that with Nokogiri's SAX parser?
<titles>
<title xml:lang="sv">Arkivvetenskap</title>
<title xml:lang="en">Archival science</title>
</titles>
In your example, title is the name of the element and xml:lang="sv" is an attribute.
This parser assumes there are no elements nested inside of title elements.
This prints
SAX parsing is usually more complex than the task needs, because you have to track the parser's state yourself across callbacks. Because of that, I recommend Nokogiri’s standard in-memory parser, or if you really need speed and memory efficiency, Nokogiri’s Reader parser.
For comparison, here is a standard Nokogiri parser for the same document
And here is a reader parser for the same document