I’m trying to build a simple grammar to parse a .NET type name string, supporting generics. I admit to being completely new to building grammars in any language. A type string might look like the following.
Foo.Bar.Blah(Mom.Dad, Son.Daughter(Frank.Bob), Dog)
Basically, it’s recursive. Y’all should understand this.
I’m completely out in the woods with this one. Not sure how to begin. What I’ve built currently, which doesn’t actually work, is this:
tree grammar XmlTypeName;
options {
language=CSharp2;
}
RPAREN
: '('
;
LPAREN
: ')'
;
SEP
: ','
;
TYPE
: ('a'..'z'|'A'..'Z'|'0'..'9'|'_')+
;
prog
: type;
type
: TYPE (RPAREN type (SEP type)? LPAREN)? (EOF)?
;
This doesn’t even get close to working. Antlr3.exe throws errors saying that RPAREN and LPAREN aren’t allowed in a tree parser. Is a tree parser even what I need?
I’d like to produce a simple AST that lets me navigate down the types.
No, you shouldn’t use a tree grammar. A tree grammar is used after a parser has created an AST. Simply remove the keyword `tree` from it.
A couple of other remarks:
- you want to match one or more `type`s inside the parentheses, but you used `type (SEP type)?`, which matches one or two `type`s. You’ll need `type (SEP type)*` instead;
- you need to account for the `.` inside the `type`s;
Something like this will do the trick, most probably:
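A minimal sketch of such a grammar, assuming ANTLR 3 and the C# target from the question. Folding the dots into the `TYPE` token through a fragment rule, the `SPACE` rule, and the `Skip()` call (the C# runtime's way of dropping a token, as far as I know) are assumptions of this sketch and not part of the question's grammar. It also names `'('` as `LPAREN` and `')'` as `RPAREN`, which the question has the other way around:

```
grammar XmlTypeName;

options {
  language=CSharp2;
}

// entry point: exactly one type, then end of input
prog
  : type EOF
  ;

// a name, optionally followed by one or more type arguments
type
  : TYPE (LPAREN type (SEP type)* RPAREN)?
  ;

LPAREN : '(' ;
RPAREN : ')' ;
SEP    : ',' ;

// dotted name such as Foo.Bar.Blah, matched as a single token
TYPE
  : ID ('.' ID)*
  ;

// whitespace between tokens is dropped (assumed C# runtime call)
SPACE
  : (' ' | '\t')+ { Skip(); }
  ;

fragment ID
  : ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')+
  ;
```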
However, the above just creates a flat list of tokens. If you want to create a proper AST, you need to “tell” ANTLR which nodes/tokens are root tokens, and which ones to discard (like the commas, parentheses, …).
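One way to mark roots and discarded tokens in ANTLR 3 is with the inline tree operators: `^` makes a token the root of the current subtree, and `!` leaves a token out of the tree. A minimal sketch, assuming `output=AST` and lexer rules that are assumptions of this sketch (a dotted `TYPE` token, whitespace dropped with the C# runtime's `Skip()`), not taken from the question:

```
grammar XmlTypeName;

options {
  language=CSharp2;
  output=AST;   // have the parser build a tree
}

// EOF is matched but kept out of the tree
prog
  : type EOF!
  ;

// TYPE becomes the root; parentheses and commas are discarded,
// so the nested type arguments end up as children of TYPE
type
  : TYPE^ (LPAREN! type (SEP! type)* RPAREN!)?
  ;

LPAREN : '(' ;
RPAREN : ')' ;
SEP    : ',' ;

TYPE
  : ID ('.' ID)*
  ;

SPACE
  : (' ' | '\t')+ { Skip(); }
  ;

fragment ID
  : ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')+
  ;
```

The same tree shape could also be expressed with a rewrite rule (`-> ^(TYPE type*)`); the inline operators are used here only because they are shorter.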
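For the example input, this gives a tree rather than a flat list: Foo.Bar.Blah is the root, with Mom.Dad, Son.Daughter and Dog as its children, and Son.Daughter in turn has Frank.Bob as its child.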
More info about creating ASTs with ANTLR: How to output the AST built using ANTLR?