The documentation for scala.xml.pull.XMLEventReader says it can be used as an Iterator[XMLEvent]. However, when it is used that way, malformed XML causes method calls to hang. For example:
scala> new xml.pull.XMLEventReader(io.Source.fromString("<a><b></a>")).toArray
Exception in thread "XMLEventReader" scala.xml.parsing.FatalError: expected closing tag of b
at scala.xml.parsing.MarkupParser$class.errorNoEnd(MarkupParser.scala:41)
at scala.xml.pull.XMLEventReader$Parser.errorNoEnd(XMLEventReader.scala:56)
at scala.xml.parsing.MarkupParserCommon$class.xEndTag(MarkupParserCommon.scala:93)
at scala.xml.pull.XMLEventReader$Parser.xEndTag(XMLEventReader.scala:56)
at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:543)
at scala.xml.pull.XMLEventReader$Parser.element1(XMLEventReader.scala:56)
at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:396)
at scala.xml.pull.XMLEventReader$Parser.content1(XMLEventReader.scala:56)
at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:417)
at scala.xml.pull.XMLEventReader$Parser.content(XMLEventReader.scala:56)
at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:542)
at scala.xml.pull.XMLEventReader$Parser.element1(XMLEventReader.scala:56)
at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:396)
at scala.xml.pull.XMLEventReader$Parser.content1(XMLEventReader.scala:56)
at scala.xml.parsing.MarkupParser$class.document(MarkupParser.scala:216)
at scala.xml.pull.XMLEventReader$Parser.document(XMLEventReader.scala:56)
at scala.xml.pull.XMLEventReader$Parser$$anonfun$run$1.apply(XMLEventReader.scala:90)
at scala.xml.pull.XMLEventReader$Parser$$anonfun$run$1.apply(XMLEventReader.scala:90)
at scala.xml.pull.ProducerConsumerIterator$class.interruptibly(XMLEventReader.scala:113)
at scala.xml.pull.XMLEventReader.interruptibly(XMLEventReader.scala:26)
at scala.xml.pull.XMLEventReader$Parser.run(XMLEventReader.scala:90)
at java.lang.Thread.run(Thread.java:680)
This call never terminates. The parse exception is printed, but it does not interrupt the call to toArray. This seems to be because the actual parsing happens on a separate thread: that thread terminates, but the error is never reported to the calling thread (this is described in issue SI-4267). Is it possible to have these exceptions re-raised on the calling thread? Is this class even intended to be used, or is there another pull parser I should be using?
If you are looking for pull parsing in Scala, you should probably check out Scales Xml.
In particular, for this case, pull parsing is driven by actual pull parsers (JDK StAX), and the XMLInputFactory used can be plugged in, allowing you to customise error handling or document processing as per the standard StAX API.
Add to that the ability to parse by both Iterator and Iteratee and you’ve got a lot of flexibility on how you handle a document.
The next version, 0.5, will also attempt to use Aalto XML to provide fully asynchronous processing.
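To illustrate why a StAX-driven pull parser avoids the hang described in the question, here is a minimal sketch that uses the JDK's javax.xml.stream API directly. It is not the Scales Xml API; the object name StaxPullSketch and the method eventNames are hypothetical names chosen for this illustration. Because the reader is advanced synchronously by the caller, a malformed document should surface as an XMLStreamException on the calling thread rather than on a background parser thread.

```scala
import java.io.StringReader
import javax.xml.stream.{XMLInputFactory, XMLStreamConstants, XMLStreamException}
import scala.collection.mutable.ListBuffer

// Hypothetical helper for illustration only; not part of Scales Xml.
object StaxPullSketch {

  /** Pulls events from `xml` on the calling thread.
    * Returns Left with the parser's exception if the document is malformed,
    * otherwise Right with the start/end element names in document order.
    */
  def eventNames(xml: String): Either[XMLStreamException, List[String]] = {
    // A differently configured XMLInputFactory could be swapped in here.
    val factory = XMLInputFactory.newInstance()
    val reader  = factory.createXMLStreamReader(new StringReader(xml))
    val events  = ListBuffer.empty[String]
    try {
      while (reader.hasNext()) {
        reader.next() match {
          case XMLStreamConstants.START_ELEMENT =>
            events += s"start:${reader.getLocalName}"
          case XMLStreamConstants.END_ELEMENT =>
            events += s"end:${reader.getLocalName}"
          case _ => () // ignore text, comments, and other event types
        }
      }
      Right(events.toList)
    } catch {
      // The parse error is raised here, in the caller's own thread.
      case e: XMLStreamException => Left(e)
    } finally {
      reader.close()
    }
  }
}
```

Calling StaxPullSketch.eventNames("<a><b></a>") should return a Left wrapping the parser's exception instead of blocking, so the caller can decide whether to rethrow, log, or recover.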
Your actual example converts to:
and runs as (saved in experiments.scalaScript and loaded via the repl):
For more examples of pull parsing see here