We have a large set of C++ projects (GCC, Linux, mostly static libraries) with many dependencies between them. Then we compile an executable using these libraries and deploy the binary on the front-end. It would be extremely useful to be able to identify that binary. Ideally what we would like to have is a small script that would retrieve the following information directly from the binary:
$ ident binary
$binary : Product=PRODUCT_NAME;Version=0.0.1;Build=xxx;User=xxx...
$ dependency: Product=PRODUCT_NAME1;Version=0.1.1;Build=xxx;User=xxx...
$ dependency: Product=PRODUCT_NAME2;Version=1.0.1;Build=xxx;User=xxx...
So it should display all the information for the binary itself and for all of its dependencies.
Currently our approach is:
- During compilation, for each product we generate Manifest.h and Manifest.cpp and then inject Manifest.o into the binary.
- The ident script parses the target binary, finds the generated data there, and prints this information.
However, this approach is not always reliable across different versions of GCC.
I would like to ask the SO community: is there a better approach to solve this problem?
Thanks for any advice.
One of the catches with storing data in source code (your Manifest.h and .cpp) is the size limit for literal data, which is dependent on the compiler.
My suggestion is to use ld. It allows you to store arbitrary binary data in your ELF file (so does objcopy). If you prefer to write your own solution, have a look at libbfd.
Let us say we have a hello.cpp containing the usual C++ “Hello world” example. Now we have the following make file (GNUmakefile):
What I’m doing here is to separate out the linking stage, because I want the manifest (after conversion to ELF object format) linked into the binary as well. Since I am using suffix rules this is one way to go; others are certainly possible, including a better naming scheme for the manifests where they also end up as .o files and GNU make can figure out how to create those. Here I’m being explicit about the recipe. So we have .om files, which are the manifests (arbitrary binary data), created from .manifest files. The recipe states to convert the binary input into an ELF object. The recipe for creating the .manifest itself simply pipes a string into the file.
Obviously the tricky part in your case isn’t storing the manifest data, but rather generating it. And frankly I know too little about your build system to even attempt to suggest a recipe for the .manifest generation.
Whatever you throw into your .manifest file should probably be some structured text that can be interpreted by the script you mention or that can even be output by the binary itself if you implement a command line switch (and disregard .so files and .so files hacked into behaving like ordinary executables when run from the shell).
The above make file doesn’t take into account the dependencies, or rather it doesn’t help you create the dependency list in any way. You can probably coerce GNU make into helping you with that if you express your dependencies clearly for each goal (i.e. the static libraries etc). But it may not be worth it to take that route …
Also look at:
If you want particular names for the symbols generated from the data (in your case the manifest), you need to use a slightly different route and use the method described by John Ripley here.
How to access the symbols? Easy. Declare them as external (C linkage!) data and then use them:
The symbols are the exact characters/bytes. You could also declare them as char[], but it would result in problems down the road, e.g. for the printf call.
The reason I am calculating the size myself is that a) I don’t know whether the buffer is guaranteed to be zero-terminated and b) I didn’t find any documentation on interfacing with the *_size variable.
Side note: the * in the format string tells printf to take the precision, i.e. the maximum number of characters to print, from an int argument and then to take the next argument as the string to print out.