Is there a well-known parser description language (like Backus-Naur) that allows for repetitions where the number of repetitions is extracted from the token stream? For bonus points, are there any C++ libraries that support this syntax?
Example:
Let’s call the “meta-token” #. I’m looking for a description language that would treat a production rule of the following form:
RULE = # EXPRESSION
As:
RULE = '1' EXPRESSION
| '2' EXPRESSION EXPRESSION
| '3' EXPRESSION EXPRESSION EXPRESSION
| '4' EXPRESSION EXPRESSION EXPRESSION EXPRESSION
| ...
Note that the counts are actual character literals. This is in contrast to augmented Backus-Naur form, where we can have rules of the form:
RULE = 2*3EXPRESSION
Which are equivalent to:
RULE = EXPRESSION EXPRESSION
| EXPRESSION EXPRESSION EXPRESSION
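To make the contrast concrete, here is a minimal hand-written sketch in standard C++ (no parser library). It assumes, purely for illustration, that EXPRESSION is a single whitespace-delimited word and that the count is written as a decimal integer; the names parse_counted and parse_bounded are hypothetical and not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Assumption for illustration: an EXPRESSION is one whitespace-delimited word.
using Expression = std::string;

// RULE = # EXPRESSION
// The count is read from the input, then exactly that many EXPRESSIONs must follow.
bool parse_counted(const std::string& input, std::vector<Expression>& out) {
    std::istringstream in(input);
    out.clear();
    int count = 0;
    if (!(in >> count) || count < 1) return false;  // missing or non-positive count
    for (int i = 0; i < count; ++i) {
        Expression e;
        if (!(in >> e)) return false;               // fewer expressions than announced
        out.push_back(e);
    }
    std::string extra;
    if (in >> extra) return false;                  // more expressions than announced
    assert(out.size() == static_cast<std::size_t>(count));
    return true;
}

// RULE = <min>*<max>EXPRESSION (ABNF style)
// The bounds are fixed by the grammar; nothing in the input says how many follow.
bool parse_bounded(const std::string& input, std::size_t min, std::size_t max,
                   std::vector<Expression>& out) {
    assert(min <= max);
    std::istringstream in(input);
    out.clear();
    Expression e;
    while (in >> e) out.push_back(e);
    return out.size() >= min && out.size() <= max;
}
```

With these definitions, parse_counted("3 a b c", out) accepts while parse_counted("2 a", out) rejects, because the count token decides how many expressions are required. parse_bounded("a b", 2, 3, out) accepts any two or three words regardless of what they are.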
Response to dgarant:
I’m not sure that’s quite what I want. I’m thinking something along the following lines:
int i;
bool r = phrase_parse(first, last,
    (
        // repeat must read i lazily (after int_ has set it), hence phoenix::ref(i)
        int_[ phoenix::ref(i) = _1 ] >> repeat(phoenix::ref(i))[/*EXPRESSION*/]
    ),
    space);
More importantly, though, I was hoping for some formalized schema that could describe this idea. On a side note, Spirit does take some getting used to, but it is pretty awesome. I’m a fan.
I can’t think of a formal language which allows rule = # EXPRESSION to specify repetition where # is a character literal. In my opinion, it shouldn’t be a problem to abuse the formal language specification provided you make a comment to clarify what you mean. If you really want to stick to standards, you could do the following in ABNF:
It doesn’t look exactly like what you want but it gets the job done.
I believe boost::spirit::qi can suit your needs for parsing. Have a look at the repeat directive.
Spirit would allow you to write rules such as
If you’re interested in determining the number of repetitions that were parsed, you can append another action to the rule:
[phoenix::ref(pCt) = qi::_a]
The style of Spirit::Qi parsers takes a while to get used to, but they’re very powerful since you can integrate them directly into your code.