I’m working on a school project that involves converting a Decaf spec in BNF form into a context-free grammar and building it in ANTLR. I’ve been working on it for a few weeks and going to the professor whenever I got stuck, but I’ve finally run into something that he says should not be causing an error. Below is the isolated part of my grammar; expr is the starting point. Before I get to that, I have one question.
Does it matter if my lexer rules appear before my parser rules in my grammar, or if they’re mixed in intermittently through my grammar file?
calloutarg: expr | STRING;
expr: multexpr ((PLUS|MINUS) multexpr)* ;
multexpr : atom ((MULT|DIVISION) atom)*
;
atom : OPENPAR expr CLOSEPAR | ID ((OPENBRACKET expr CLOSEBRACKET)? | OPENPAR ((expr (COMMA)* )+)? CLOSEPAR)|
CALLOUT OPENPAR STRING (COMMA (calloutarg)+ COMMA)? CLOSEPAR | constant;
constant: INT | CHAR | boolconstant;
boolconstant: TRUE|FALSE;
The ugly formatting is because part of his advice for debugging was to take individual rules and break them down where the ambiguity is to see where the errors are starting. In this case, it’s saying the problem is in the long ID portion, that OPENBRACKET and OPENPAR are the cause. If you have any ideas at all, I am deeply appreciative. Thank you, and sorry for how nasty the formatting is on the code I posted.
No, that does not matter.
The problem is that inside your `atom` rule, ANTLR cannot make a choice between these three variants: `ID ( ...`, `ID [ ...` and `ID`, without (possibly) resorting to backtracking. You could resolve it by using some syntactic predicates, which look like `(...)=> ...`. A syntactic predicate is nothing more than a “look ahead”: if this “look ahead” is successful, the parser chooses that particular path.
Your current `atom` rule can be rewritten so that each variant is a separate alternative, and syntactic predicates can then be placed in front of the `ID [ ...` and `ID ( ...` alternatives,
which should do the trick.
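As a sketch of what that could look like, assuming ANTLR 3 (where the `(...)=>` syntax is available) and assuming the alternatives are taken unchanged from the `atom` rule in the question, only split apart:

```antlr
// Sketch only: the same alternatives as the question's atom rule,
// with the three ID variants split into separate branches.
atom
  :  OPENPAR expr CLOSEPAR
     // look ahead for "ID [" before committing to the indexed form
  |  (ID OPENBRACKET)=> ID OPENBRACKET expr CLOSEBRACKET
     // look ahead for "ID (" before committing to the call form
  |  (ID OPENPAR)=>     ID OPENPAR ((expr (COMMA)* )+)? CLOSEPAR
     // plain identifier, reached only when neither predicate matches
  |  ID
  |  CALLOUT OPENPAR STRING (COMMA (calloutarg)+ COMMA)? CLOSEPAR
  |  constant
  ;
```

Dropping the two `(...)=>` prefixes gives the plain rewritten rule without predicates, which accepts the same input as the original but still triggers the warning.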
Note: do not use ANTLRWorks to generate or test the parser! It cannot handle predicates (well). Best do it on the command line.
Also see: https://wincent.com/wiki/ANTLR_predicates
EDIT
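Let’s label the six different “branches” of your `atom` rule from `A` to `F`, where `A` is the parenthesized expression, `C` is the `ID ( ... )` form and `D` is the bare `ID`. Now, when the (future) parser has to handle input in which an `ID` is immediately followed by a parenthesized expression, ANTLR does not know how the parser should handle it. It could be parsed in two different ways: as branch `D` followed by branch `A`, or as branch `C` alone. That is the source of the ambiguity ANTLR is complaining about. If you were to comment out one of the branches `A`, `C` or `D`, the error would disappear.
Hope that helps.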
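To make the labels concrete, here is a sketch of the split-out rule with the branch letters as comments. Only `A`, `C` and `D` are pinned down by the explanation above; the positions of `B`, `E` and `F` are an assumption based on the order of the alternatives in the question.

```antlr
atom
  :  OPENPAR expr CLOSEPAR                                          // A
  |  ID OPENBRACKET expr CLOSEBRACKET                               // B (assumed)
  |  ID OPENPAR ((expr (COMMA)* )+)? CLOSEPAR                       // C
  |  ID                                                             // D
  |  CALLOUT OPENPAR STRING (COMMA (calloutarg)+ COMMA)? CLOSEPAR   // E (assumed)
  |  constant                                                       // F (assumed)
  ;

// Input:    ID OPENPAR expr CLOSEPAR
// Parse 1:  D (the ID), then A (the parenthesized expr)
// Parse 2:  C (a single call-style atom)
```

The first parse is reachable because `(expr (COMMA)* )+` inside branch `C` permits two expressions in a row with no `COMMA` between them, so one `atom` can directly follow another.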