I use XSLT to convert this:
1)
<fruit>
  <apple count="2"/>
  <banana count="3"/>
</fruit>
into this:
2)
Apple: 2
Banana: 3
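For reference, the forward transformation from 1) to 2) could look roughly like the stylesheet below. This is only a sketch under assumptions, not my actual stylesheet: it assumes XSLT 1.0, that every child of fruit becomes one output line, and that the label is simply the element name with its first letter upper-cased. The variable names lower and upper are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Produce plain text rather than XML. -->
  <xsl:output method="text"/>

  <!-- Hypothetical helper variables: XSLT 1.0 has no upper-case(),
       so translate() is used to capitalize the first letter. -->
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <xsl:template match="/fruit">
    <!-- Assumption: each child element yields one "Name: count" line. -->
    <xsl:for-each select="*">
      <xsl:value-of select="translate(substring(name(), 1, 1), $lower, $upper)"/>
      <xsl:value-of select="substring(name(), 2)"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="@count"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

What I am asking about is the opposite direction, which is harder because the text input has no markup for a stylesheet to match on.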
Is there a library that serves as a parser for text data, allows a descriptive declaration of the expected tokens (think of both Extended Backus–Naur Form [EBNF] and Lex/Yacc regex-based hints), and builds an XML DOM from the text?
Yes. FXSL 2.0 has a function, f:lr-parse(), which does exactly that. It is written in pure XSLT 2.0 and implements a general (table-driven) LR(1) parser that accepts as input an XML file containing the parse tables and a text file containing the “sentence” to be parsed.
I have used this function for a number of parsers, ranging from toy arithmetic expressions, to a medium-sized language (JSON), to a very large one (XPath 2.0).
See, for example, this article on my blog: Transforming JSON