Does anyone have a complete implementation (possibly on GitHub or Google Code) that uses an ANTLR grammar file to analyze Java source code? For example, I simply want to be able to count the number of variables, methods, etc.
Ideally it would use a recent version of ANTLR.
I thought I’d take a crack at this over my lunch break. This may not completely solve your problem, but it might give you a place to start. The example assumes you’re doing everything in the same directory.
Download the ANTLR source from GitHub. The pre-compiled “complete” JAR from the ANTLR site contains a known bug. The GitHub repo has the fix.
Extract the ANTLR tarball.
Build the ANTLR “complete” JAR.
Download a Java grammar. There are others, but I know this one works.
Compile the grammar to Java source.
Compile the Java source.
Add the following source file, Main.java.
Compile.
Select a type of Java source that you want to count; for example, VAR_DECLARATOR, FUNCTION_METHOD_DECL, or VOID_METHOD_DECL.
Run on any file, including the recently created Main.java.
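The counting step itself comes down to a recursive walk over the parse tree, comparing each node's type against the one you selected. Here is a minimal sketch of that idea. To keep it runnable without the ANTLR JAR or the generated parser, it uses a hypothetical Node class and a hand-built sample tree; neither is part of ANTLR, and the tree shape is only a guess at what the grammar produces. With the ANTLR 3 runtime, the same walk would go through the Tree interface's getType(), getChildCount(), and getChild(int) instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser's AST node; NOT an ANTLR class.
class Node {
    final String type;
    final List<Node> children = new ArrayList<>();

    Node(String type, Node... kids) {
        this.type = type;
        for (Node kid : kids) {
            children.add(kid);
        }
    }
}

class NodeCounter {
    // Counts the nodes of the given type in the subtree rooted at node,
    // including node itself.
    static int count(Node node, String type) {
        int total = node.type.equals(type) ? 1 : 0;
        for (Node child : node.children) {
            total += count(child, type);
        }
        return total;
    }

    // Hand-built tree, assumed for illustration only: a class with one field,
    // one void method (holding a local variable and a for-each loop), and
    // one value-returning method. "CLASS" is a made-up label.
    static Node sample() {
        return new Node("CLASS",
            new Node("VAR_DECLARATOR"),
            new Node("VOID_METHOD_DECL",
                new Node("VAR_DECLARATOR"),
                new Node("FOR_EACH")),
            new Node("FUNCTION_METHOD_DECL"));
    }

    public static void main(String[] args) {
        Node tree = sample();
        // The for-each loop variable is not a VAR_DECLARATOR node here,
        // so only the field and the local variable are counted.
        System.out.println(count(tree, "VAR_DECLARATOR"));   // prints 2
        System.out.println(count(tree, "VOID_METHOD_DECL")); // prints 1
    }
}
```

Comparing strings keeps the sketch readable; a generated parser exposes node types as integer constants, so a real version would compare ints.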
This is imperfect, of course. If you look closely, you may have noticed that the local variable of the enhanced for statement wasn’t counted. For that, you’d need to use the type FOR_EACH, rather than VAR_DECLARATOR. You’ll need a good understanding of the elements of Java source, and be able to take reasonable guesses at how those match to the definitions of this particular grammar.
You also won’t be able to do counts of references. Declarations are easy, but counting uses of a field, for example, requires reference resolution. Does p.C.f refer to a static field f of a class C inside a package p, or does it refer to an instance field f of the object stored by a static field C of a class p? Basic parsers don’t resolve references for languages as complex as Java, because the general case can be very difficult. If you want this level of control, you’ll need to use a compiler (or something closer to it). The Eclipse compiler is a popular choice.
I should also mention that you have other options besides ANTLR. JavaCC is another parser generator. The static analysis tool PMD, which uses JavaCC as its parser generator, allows you to write custom rules that could be used for the kinds of counts you indicated.
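To make the p.C.f ambiguity concrete, here is a small sketch of the second reading, where p is a class and C is a static field. The names Holder and AmbiguityDemo are hypothetical, chosen only for this illustration. The expression p.C.f below is token-for-token identical to an access of static field f on class C in package p; only the surrounding declarations tell the two apart, and a plain parser never looks at those.

```java
// Hypothetical helper type with an instance field named f.
class Holder {
    int f = 42;
}

// A class named p (an unconventional but legal name) whose static field C
// holds a Holder object.
class p {
    static Holder C = new Holder();
}

class AmbiguityDemo {
    static int read() {
        // Resolved by the compiler as: class p -> static field C -> instance field f.
        // With a package p containing a class C, the same text would instead
        // name a static field f of that class.
        return p.C.f;
    }

    public static void main(String[] args) {
        System.out.println(read()); // prints 42
    }
}
```

A parser sees the same chain of identifiers and dots in both cases, which is why counting uses of a field requires the symbol information that a compiler builds.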