I have a simple grammar:
grammar sample;
options { output = AST; }
assignment
: IDENT ':=' expr ';'
;
expr
: factor ('*' factor)*
;
factor
: primary ('+' primary)*
;
primary
: NUM
| '(' expr ')'
;
IDENT : ('a'..'z')+ ;
NUM : ('0'..'9')+ ;
WS : (' '|'\n'|'\t'|'\r')+ {$channel=HIDDEN;} ;
Now I want to add some rewrite rules to generate an AST. From what I’ve read online and in the Language Patterns book, I should be able to modify the grammar like this:
assignment
: IDENT ':=' expr ';' -> ^(':=' IDENT expr)
;
expr
: factor ('*' factor)* -> ^('*' factor+)
;
factor
: primary ('+' primary)* -> ^('+' primary+)
;
primary
: NUM
| '(' expr ')' -> ^(expr)
;
But it does not work. Although it compiles fine, when I run the parser I get a RewriteEmptyStreamException error. Here’s where things get weird.
If I define the pseudo tokens ADD and MULT and use them instead of the tree node literals, it works without error.
tokens { ADD; MULT; }
expr
: factor ('*' factor)* -> ^(MULT factor+)
;
factor
: primary ('+' primary)* -> ^(ADD primary+)
;
Alternatively, if I use the node suffix notation, it also appears to work fine:
expr
: factor ('*'^ factor)*
;
factor
: primary ('+'^ primary)*
;
Is this discrepancy in behavior a bug?
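No, not a bug, AFAIK. Take your expr rule for example:
expr
: factor ('*' factor)* -> ^('*' factor+)
;
Since the '*' might not be present, it should not be in your AST rewrite rule either. When the input contains only a single factor, the rewrite has no '*' token to use as the root, and that empty token stream is what triggers the RewriteEmptyStreamException. So the above is incorrect, and ANTLR is right to complain about it.
Now if you insert an imaginary token like MULT instead:
expr
: factor ('*' factor)* -> ^(MULT factor+)
;
all is okay, since an imaginary token can always be created and your rule will always produce one or more factors.
What you probably meant to do is something like this: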
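A sketch of that idea, assuming ANTLR 3 with output = AST as in your grammar: each rule first rewrites to its leading operand, and every operator matched inside the loop then wraps the tree built so far ($expr or $factor) together with the new operand. The labels f and p are names picked here for illustration.

```antlr
expr
  :  (factor -> factor)                       // start with the first operand
     ('*' f=factor -> ^('*' $expr $f))*       // each '*' becomes the new root
  ;

factor
  :  (primary -> primary)                     // start with the first operand
     ('+' p=primary -> ^('+' $factor $p))*    // each '+' becomes the new root
  ;
```

Written this way, the '*' and '+' tokens are only referenced inside the subrule that actually matched them, so an input such as x := 1; should no longer hit an empty stream.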
Also see chapter 7, Tree Construction, of The Definitive ANTLR Reference, especially the sections Rewrite Rules in Subrules (page 173) and Referencing Previous Rule ASTs in Rewrite Rules (pages 174-175).