I’m trying to parse a language using ANTLR which can contain the following syntax:
someVariable, someVariable.someMember, functionCall(param).someMember, foo.bar.baz(bjork).buffalo().xyzzy
This is the ANTLR grammar I’ve come up with so far. The access_operation rule triggers this error:
The following sets of rules are mutually left-recursive [access_operation, expression]:
grammar Test;

options {
  output=AST;
  ASTLabelType=CommonTree;
}

tokens {
  LHS;
  RHS;
  CALL;
  PARAMS;
}

start
  : body? EOF
  ;

body
  : expression (',' expression)*
  ;

expression
  : function -> ^(CALL)
  | access_operation
  | atom
  ;

access_operation
  : (expression -> ^(LHS)) '.'! (expression -> ^(RHS))
  ;

function
  : (IDENT '(' body? ')') -> ^(IDENT PARAMS?)
  ;

atom
  : IDENT
  | NUMBER
  ;

fragment LETTER : ('a'..'z' | 'A'..'Z');
fragment DIGIT  : '0'..'9';

IDENT  : (LETTER)+ ;
NUMBER : (DIGIT)+ ;
SPACE  : (' ' | '\t' | '\r' | '\n') { $channel=HIDDEN; };
The best I have managed so far is to refactor the access_operation rule to '.' expression, which generates an AST where the access_operation node contains only the right side of the operation.
What I’m looking for instead is something like this:

How can the left-recursion problem be solved in this case?
By “wrong AST” I’ll make a semi-educated guess that, for input like "foo.bar.baz", you get an AST where foo is the root with bar as a child, which in turn has baz as a child, which is a leaf in the AST. You may want to have this reversed. But I’d not go for such an AST if I were you: I’d keep the AST as flat as possible, with foo as the root and everything that follows it as its direct children.
That way, evaluating is far easier: you simply look up foo, and then walk from left to right through its children.
A quick demo:
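A minimal sketch of what such a flat grammar could look like in ANTLR v3, reusing the lexer rules from the question. The grammar name FlatAccess, the imaginary tokens ACCESS and CALL, and the rule names access and call are assumptions made for illustration, not names from the original demo. The left recursion is avoided by letting expression start with a token and then loop over the trailing accesses and calls, so no rule refers to itself in leftmost position. With this shape, foo.bar.baz(bjork) should come out roughly as foo at the root with ACCESS bar, ACCESS baz and CALL bjork as its children, in that order.

```
grammar FlatAccess; // hypothetical name

options {
  output=AST;
  ASTLabelType=CommonTree;
}

tokens {
  ACCESS; // assumed imaginary token: one '.member' step
  CALL;   // assumed imaginary token: one '(...)' step
}

start
  : body? EOF!
  ;

body
  : expression (','! expression)*
  ;

// The first IDENT becomes the root (the ^ operator); every access or
// call that follows is appended as a child, from left to right.
expression
  : NUMBER
  | IDENT^ (call | access)*
  ;

access
  : '.' IDENT -> ^(ACCESS IDENT)
  ;

// The recursion through body is not left recursion,
// because '(' is consumed before body is entered.
call
  : '(' body? ')' -> ^(CALL body?)
  ;

fragment LETTER : ('a'..'z' | 'A'..'Z');
fragment DIGIT  : '0'..'9';

IDENT  : (LETTER)+ ;
NUMBER : (DIGIT)+ ;
SPACE  : (' ' | '\t' | '\r' | '\n') { $channel=HIDDEN; };
```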
which can be tested with:
The output of Main corresponds to the following AST:
EDIT
And since you indicated your ultimate goal is not evaluating the input, but that you rather need to conform the structure of the AST to some third-party API, here’s a grammar that will create an AST like you indicated in your edited question:
which creates the following AST if you run the Main class:
The atom rule may be a bit daunting, but you can’t shorten it much since the left ID needs to be available to most of the alternatives. ANTLRWorks helps in visualizing the alternative paths this rule may take:
which means atom can be any of the 5 following alternatives (with their corresponding ASTs):