I’m making a static analyzer for C.
I have done the lexer and parser using ANTLR in which generates Java code.
Does ANTLR build the AST for us automatically by options {output=AST;}? Or do I have to make the tree myself? If it does, then how to spit out the nodes on that AST?
I am currently thinking that the nodes on that AST will be used for making SSA, followed by data flow analysis in order to make the static analyzer. Am I on the right path?
No, the parser does not know what you want as root and as leaves for each parser rule, so you’ll have to do a bit more than just put
options { output=AST; }in your grammar.For example, when parsing the source
"true && (false || true && (true || false))"using the parser generated from the grammar:the following parse tree is generated:
(i.e. just a flat, 1 dimensional list of tokens)
You’ll want to tell ANTLR which tokens in your grammar become root, leaves, or simply left out of the tree.
Creating AST’s can be done in two ways:
foo : A B C D -> ^(D A B);, wherefoois a parser rule that matches the tokensA B C D. So everything after the->is the actual rewrite rule. As you can see, the tokenCis not used in the rewrite rule, which means it is omitted from the AST. The token placed directly after the^(will become the root of the tree;^and!after a token inside your parser rules where^will make a token the root, and!will delete a token from the tree. The equivalent forfoo : A B C D -> ^(D A B);would befoo : A B C! D^;Both
foo : A B C D -> ^(D A B);andfoo : A B C! D^;will produce the following AST:Now, you could rewrite the grammar as follows:
which will create the following AST from the source
"true && (false || true && (true || false))":Related ANTLR wiki links:
Never did anything like that, but IMO the first thing you’d want is an AST from the source, so yeah, I guess your on the right path! 🙂
EDIT
Here’s how you can use the generated lexer and parser: