Some days ago I accidentally opened a C++ executable of a commercial application in Notepad++ and found out that there's quite a lot of information about the original source code stored in the executable.
Inside the executable I could find file names (app.c, dlgstat.c, …), function names (GetTickCount, DispatchMessageA, …) and small pieces of source code, mostly conditions (szChar != TEXT('\0'), iRow < XTGetRows( hwndList )). After that I checked another executable, this one built with Qt, and again found source file names and method signatures.
Because of that I am wondering how much source code information is really stored in a C/C++ executable (e.g., one built with Qt or MinGW). Is this probably some kind of debug build still containing the original source? Is this information used for some reflection stuff? Is there any reason why publishers don't remove this stuff?
In practice, not much. The source code is not required at runtime. The strings you name come from two things:
- The function names (e.g. GetTickCount) are the names of functions imported from other modules. The names are required at runtime because the functions are resolved dynamically (by calling GetProcAddress with the function name).
- The conditions are likely assertions: the assert macro stringizes its argument so that when it fires you know what condition was not met.

If you build a DLL, it will also contain the names of all of the functions it exports, so they can be resolved at runtime (the same is likely true for other shared object formats).
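To illustrate the assertion point, here is a minimal sketch of how an assert-style macro puts the text of a condition into the binary. MY_CHECK and check_message are hypothetical names made up for this example; the real assert from <cassert> is implementation-specific, but it uses the same preprocessor # operator, and implementations typically also embed __FILE__, which would explain source file names such as app.c showing up in the executable.

```cpp
#include <string>

// Hypothetical stand-in for assert(); MY_CHECK and check_message are
// illustrative names, not part of any library.
// Returns an empty string when the condition holds; otherwise a message
// built from the stringized condition and the source file name.
inline std::string check_message(bool ok, const char* expr, const char* file) {
    if (ok) {
        return "";
    }
    return std::string("Assertion failed: ") + expr + ", file " + file;
}

// The preprocessor's # operator turns the macro argument into a string
// literal, so the text of the condition ends up as data in the compiled
// binary. __FILE__ expands to the name of the current source file.
#define MY_CHECK(cond) check_message((cond), #cond, __FILE__)
```

Calling MY_CHECK(iRow < rows) compiles the literal "iRow < rows" into the program whether or not the check ever fails, which is why such conditions are visible in a text editor. With the standard assert, defining NDEBUG makes the macro expand to nothing, so those strings are not emitted in builds that define it.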
Debug symbols may also contain some of the original source code, though it depends on the format used by the debug symbols. These symbols may be contained either in the binary itself or in an auxiliary file (for example, .pdb files used on Windows).