I’m looking at implementing a C preprocessor in two phases, where the first phase

Asked: June 17, 2026 (2026-06-17T14:42:46+00:00)

I’m looking at implementing a C preprocessor in two phases, where the first phase converts the source file into an array of preprocessing tokens. This would be good for simplicity and performance, as the work of tokenizing would not need to be redone when a header file is included by multiple files in a project.

The snag:

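#include <stdio.h>

#define f(x) #x

int main(void) {
    puts(f(a+b));   /* no whitespace between the argument tokens */
    puts(f(a + b)); /* whitespace between the argument tokens */
    return 0;
}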

According to the standard, the output should be:

a+b
a + b

i.e. the information about whether constituent tokens were separated by whitespace is supposed to be preserved. This would require the two-phase design to be scrapped.

The uses of the # operator that I’ve seen so far don’t actually need this, e.g. assert would still work fine if the output were always a + b regardless of whether the constituent tokens were separated by whitespace in the source file.

Is there any existing code anywhere that does depend on the exact behavior prescribed by the standard for this operator?

1 Answer

Editorial Team · Answer 1 · 2026-06-17T14:42:48+00:00

You might want to look at the preprocessor of the LCC compiler, written as an example ANSI C compiler for compiler courses. Another preprocessor is MCPP.
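On the design question itself: for the # operator the standard turns each run of whitespace between the argument's tokens into a single space and drops whitespace at the start and end of the argument, so one bit per token is enough to reproduce the required spelling. The following is a minimal sketch under that assumption, not code taken from LCC or MCPP. The names `PPToken`, `space_before`, and `pp_stringify` are hypothetical, and the escaping of `"` and `\` inside string and character literals is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token record. The tokenizer would set space_before when
   whitespace (or a comment) preceded the token in the source text. */
typedef struct {
    const char *text;   /* spelling of the preprocessing token */
    bool space_before;  /* true if whitespace preceded it */
} PPToken;

/* Writes the spelling that # would produce for the argument tokens,
   as far as whitespace is concerned: one space wherever whitespace
   separated two tokens, nothing at either end. Surrounding quotes and
   escaping inside string/character literals are omitted in this sketch.
   Returns the length written, or (size_t)-1 if the buffer is too small. */
size_t pp_stringify(const PPToken *toks, size_t n, char *out, size_t cap)
{
    size_t len = 0;
    assert(out != NULL && cap > 0);
    for (size_t i = 0; i < n; i++) {
        size_t tlen = strlen(toks[i].text);
        /* Whitespace before the first token of the argument is dropped. */
        bool need_space = (i > 0) && toks[i].space_before;
        /* Keep one byte free for the terminating NUL. */
        if (len + (need_space ? 1 : 0) + tlen >= cap)
            return (size_t)-1;
        if (need_space)
            out[len++] = ' ';
        memcpy(out + len, toks[i].text, tlen);
        len += tlen;
    }
    out[len] = '\0';
    return len;
}
```

With the three tokens `a`, `+`, `b` and every flag clear, the buffer receives `a+b`; with the flags set it receives `a + b`. Since the flag lives in the token array, a header tokenized once can still be stringified correctly wherever it is included.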

C/C++ preprocessing is quite tricky. If you stick with it, make sure to get at least drafts of the relevant standards, and borrow test suites from somewhere.
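As for code that depends on the exact spelling: one place where it plausibly matters is stringifying something that is not an expression, such as a file path that later serves as a file name. The sketch below illustrates the idea. The macro name `STRINGIFY` and the function `check_header_name` are assumptions for illustration, and this is not a claim about any particular code base. A preprocessor that always put a space between tokens would yield `sys / types . h` here, and the comparison would fail.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper macro; the argument is stringified unexpanded. */
#define STRINGIFY(x) #x

/* Path built from bare tokens: sys / types . h, with no whitespace
   between them in the source. */
static const char header_name[] = STRINGIFY(sys/types.h);

/* Aborts if the stringified path picked up any inserted spaces. */
void check_header_name(void)
{
    assert(strcmp(header_name, "sys/types.h") == 0);
}
```

A check of this kind could also go into the test suite of a preprocessor under development, next to the `a+b` and `a + b` cases from the question.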
