Sometimes, when using macros to generate code, it is necessary to create identifiers that have global scope but which aren’t really useful for anything outside the immediate context where they are created. For example, suppose it’s necessary to compile-time-allocate an array or other indexed resource into chunks of various sizes.
/* Produce an enumeration of some story-book characters, and allocate some
arbitrary index resource to them. Both enumeration and resource indices
will start at zero.
For each name, defines HPID_xxxx to be the enumeration of that name.
Also defines HP_ID_COUNT to be the total number of names, and
HP_TOTAL_SIZE to be the total resource requirement, and creates an
array hp_starts[HP_ID_COUNT+1]. Each character n is allocated resources
from hp_starts[n] through (but not including) hp_starts[n+1].
*/
/* Give the names and their respective lengths */
#define HP_LIST \
HP_ITEM(FRED, 4) \
HP_ITEM(GEORGE, 6) \
HP_ITEM(HARRY, 5) \
HP_ITEM(RON, 3) \
HP_ITEM(HERMIONE, 8) \
/* BLANK LINE REQUIRED TO ABSORB LAST BACKSLASH */
#define HP_ITEM(name, length) HPID_##name,
typedef enum { HP_LIST HP_ID_COUNT} HP_ID;
#undef HP_ITEM
#define HP_ITEM(name, length) ZZQ_##name}; enum {ZZQX_##name=ZZQ_##name+(length)-1,
enum { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM
#define HP_ITEM(name, length) ZZQ_##name,
const unsigned char hp_starts[] = { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM
#include <stdio.h>
int main(void)
{
int i;
printf("ID count=%d Total size=%d\n",HP_ID_COUNT,HP_TOTAL_SIZE);
for (i=0; HP_ID_COUNT > i; i++) /* Reverse conditional to avoid lt sign */
printf(" %2d=%3d/%3d\n", i, hp_starts[i], hp_starts[i+1]-hp_starts[i]);
printf("IDs are: \n");
#define HP_ITEM(name, length) printf(" %2d=%s\n",HPID_##name, #name);
HP_LIST
#undef HP_ITEM
}
Is there any normal convention for naming such identifiers to minimize the likelihood of conflicts, and also to minimize any confusion they might generate? In the above scenario, identifiers ZZQ_xxx will be the same as hp_starts[HPID_xxx], and might in some contexts be useful, though their primary purpose is to build the array and serve as placeholders in computing other ZZQ values and HP_TOTAL_SIZE. Identifiers ZZQX_xxx are useless, however; their sole purpose is to serve as placeholders when setting the enumeration values for the succeeding items. Is there any good way to name such things?
Incidentally, I develop for small microcontrollers where RAM is at a greater premium than code space. Code is simulated by compiling on Microsoft VC++, but for production is compiled using a cross-compiler in straight C; code must thus compile in both C and C++.
Are there any other preprocessor tricks people can recommend for similar tasks?
It all boils down to the prefix you want to use. Ideally, one would want all the symbols to be easily associated with the list (HP_LIST) they are related to. So why not put the symbols under the same HP_ prefix? E.g. prefix HP__ZZQX_, to differentiate between the useful and the useless symbols.
N.B. I have worked on a project where one of the shared libraries was already using (internally) a zzqx_ prefix; it was always showing up at the end of the application’s symbol table. In the race for unlikely-to-be-used names, apparently many people take the same route (end of the Latin alphabet) and end up with precisely the same names: the opposite of the desired result. That is why I think that namespaces (or, in C, the symbol prefixes) should not be hidden/buried in the defines, but rather explicitly defined (e.g. easy to find and extract).
And as something concrete, here is your source enhanced with the hack around ## to generate the names using the prefix given as a preprocessor define:
Edit 1. My preferred approach is to put the data into a proper text file, e.g.:
(note that you do not need length anymore) and write a script (or even a trivial C program) to generate source code from the text file, creating the necessary header (with the enum + declaration of the data) and source file (with the data). Modify the Makefile to run the script before compiling any sources and add the generated source files to the list of compiled sources.
That has the HUGE advantage that the generated code is plain code and can be indexed as such (unless you love the fun of “where did that darn id come from?”). The internal constants simply do not appear in the source code anymore, since the script handles them. And no more fugly preprocessor magic.