I have two java classes that are very similar in semantics but differ in syntax. The differences are minor, like –
Changes in variable names,
Changes in position of some statements (with no dependent lines in between),
Extra imports, etc.
I need to compare these two classes to prove that they are indeed semantically identical. The same needs to be done for a large number of java file pairs.
The first approach of reading from the two files and comparing the lines, with logic to deal with the differences mentioned above seems inefficient. Is there some other way that I can achieve this task? Any helpful APIs out there?
Compile both of the classes without debug information and then decompile them back to source files. The decompiled files should be a lot more similar than the original source files.
You can improve this further by running some optimizations on the compiled files. For example you can use Proguard with just shrinking enabled to removed unused code.
Changes in position of some statements can be hard to detect though.