I have some code that is written in french, that is the variables, class, function all have french names. The comments are also in french. I’d like to translate the code to english. This will be quite a challenge, since it’s a 18K lines project and I’d like to know if there is any tool that could help me, especially with the variables/class/function names, since it will be error prone to rename them all.
Is there any tools that can help me? Advices?
edit : I’m not looking for machine translation. I’m looking for a tool that would help me translate the code. Let’s say there is class name C and this class has a method named TraverserLaRue and I rename it CrossTheRoad I’d like all references to TraverserLaRue in all files to be translated as CrossTheRoad. However I don’t want the method TraverserLaRue of class B to be translated.
I assume the langauge in question is one of the common ones, such as C, C++, C#, Java, …
(You don’t have a language with French keywords? I once encountered an entirely Swedish version of Pascal, and I gave up on working that).
So you have two problems:
Since comments contain arbitrary natural language text, you’ll need an arbitrary translation of them. I don’t think you can find an automated tool to do that.
Unlike others, however, I think you have a decent chance at translating the identifiers
and changing them en masse.
SD makes a line of source code “obfuscator” products. These tools don’t process the code as raw text, rather they process the source code in terms of the targeted language; they accurately distinguish identifiers from operators, numbers, comments etc. In particular, they
operate reliably as need on just the identifiers.
One of the things these tools do is to replace one identifier name by another (usually a nonsense name) to make the code really hard to understand. Think abstractly of a map of identifier names I -> N. (They do other things, but that’s not interesting here). Because you often want to re-obfuscate a file that has changed, the same way as an original, these tools allow you to reuse a previous cycle’s identifier map, which is represented as list of I -> N pairs.
I think you can abuse this to do what you want.
Step 1: Run such an obfuscator on your original French code. This will produce a text file containing all the identifiers in the code as a map of the form
You don’t care about the Ns, just the I’s.
Step 2: Manually translate each French I to an English name E you think fits best.
(I have no specific suggestions about how to do this; some of the other answers here
have suggestions).
Some of the I’s are likely to be library calls and are thus already correct.
You can modify the text obfuscation map file to be:
Step 3: Run the obfuscation tool, and make it use your modified obfuscation map.
It can be told to do that.
Viola, all the identifiers in your code will be changed the way you specify.
[You may get, as a freebie, the re-formatting of your original text. These tools can also format code nicely. Your name changes are likely to screw up the indentation/spacing in the original text so this is a nice bonus].