I need to analyze some C++ source files in order to generate some very basic information. The thing that I am trying to do goes like this:
- Assume that we have the C++ grammar and the C++ source file to be analyzed
- The analyzer will read the source file like a lexical analyzer and identify the keywords etc. as defined by the C++ grammar.
- After reading each line, the analyzer will output the following information: Line#: lexical information. For example, consider this:
int main(int x, int y)
{
    return x+y;
}
The program will output:
Line 1: function: main, params: x, y
Line 2: paren "{"
Line 3: keyword: "return"
or something similar.
Can somebody please tell me how to do this? I have looked at ANTLR and TXL but I’m guessing that there should be a simpler way. I’d like to write a Java program that’ll do this work.
The first thing that I would like to do is to get the function definitions in a file, with their corresponding line numbers. Any help will be much appreciated.
Thanks,
Anton
Your best bet, as of today, is probably Clang.
While Clang is known for being a C/C++/Objective-C/Objective-C++ frontend on top of LLVM, it has been designed as a set of libraries specifically so that individual components can be re-used outside the compiler itself.
Of interest:
- libclang is a C library that wraps the core preprocessing and semantic analysis to provide a “parsed-tree” in C, because C is the lingua franca. libclang notably serves as a basis for the Python bindings, so if you really want it in Java you should be able to use the JNI (if I remember correctly the terms) to interface with it. Also, the libclang interface is extremely stable (unlike the internal compiler representations) as it is meant to be used by external users.
- The Python bindings have already been used to create clang_complete, a vim plugin for auto-completion. You can read this blog article about it for example (there is a nifty video showing it in action).
Insider note: the Python bindings are currently being significantly improved by Gregory Szorc under the guidance of Tobias Grosser; you can see Gregory’s announcement here.
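
If all you want at first is the per-line lexical output from the question (keywords, identifiers, braces), a rough sketch in plain Java might look like the following. This is only an illustration under heavy assumptions: the class name LineLexer, the method describe, and the tiny keyword list are all made up for this example, and it ignores comments, string literals, preprocessor directives, and multi-character operators. It cannot tell a function definition from a call or a declaration; for that you need a real parser, which is where libclang comes in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a real C++ lexer.
final class LineLexer {
    // A deliberately small subset of C++ keywords (assumption: extend as needed).
    private static final Set<String> KEYWORDS = Set.of(
            "int", "char", "void", "double", "return", "if", "else",
            "for", "while", "class", "struct", "namespace");

    // Group 1: word (keyword or identifier), group 2: integer literal,
    // group 3: any other single non-whitespace character.
    private static final Pattern TOKEN =
            Pattern.compile("([A-Za-z_]\\w*)|(\\d+)|(\\S)");

    // Returns one description per non-empty source line, numbered from 1.
    static List<String> describe(String source) {
        List<String> out = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            List<String> parts = new ArrayList<>();
            Matcher m = TOKEN.matcher(lines[i]);
            while (m.find()) {
                String kind;
                if (m.group(1) != null) {
                    kind = KEYWORDS.contains(m.group(1)) ? "keyword" : "identifier";
                } else if (m.group(2) != null) {
                    kind = "number";
                } else {
                    kind = "punct";
                }
                parts.add(kind + " \"" + m.group() + "\"");
            }
            if (!parts.isEmpty()) {
                out.add("Line " + (i + 1) + ": " + String.join(", ", parts));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The example from the question, split over several lines.
        String src = "int main(int x, int y)\n{\n    return x+y;\n}";
        describe(src).forEach(System.out::println);
    }
}
```

Working line by line keeps the line numbers trivial to report, but it is also the reason this approach breaks down on anything that spans lines (block comments, multi-line declarations), so treat it as a starting point rather than a substitute for a proper frontend.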