Exercise 3.3.1 of the Dragon Book asks the student to
Consult the language reference manuals
to determine (i) the set of characters
that form the input alphabet
(excluding those that may only appear
in character strings or comments […]
for each of the following languages:
[…].
It makes no real sense to me to list every single character, like a, b, /, for a language, even as an exercise in a compilers course. Isn’t the alphabet of a programming language the set of possible words, like {id, int, float, string, if, for, ... }?
And if “characters” really is meant in the basic sense of the word, is ??/ in C one character or three (or both)?
The alphabet of a language is the set of characters, not the words.
No, the alphabet is the set of characters that are used to form words. When a language is specified, the alphabet must be given; otherwise you cannot distinguish a valid token from an invalid token.
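To make that concrete, here is a minimal sketch of the very first check an alphabet allows: rejecting input that contains characters outside it. The alphabet below and the helper name `invalid_characters` are made up for illustration; they are not taken from any real language reference manual.

```python
import string

# Hypothetical input alphabet for a toy language (an assumption for this
# example, not the alphabet of C or of any real language).
ALPHABET = set(string.ascii_letters + string.digits + "_+-*/=;(){} \n")

def invalid_characters(source):
    """Return (index, character) pairs for characters outside ALPHABET."""
    return [(i, ch) for i, ch in enumerate(source) if ch not in ALPHABET]

ok = invalid_characters("x = 1;")
bad = invalid_characters("x = 1 @ 2;")
```

With this alphabet, `"x = 1;"` uses only allowed characters, while the `@` in `"x = 1 @ 2;"` is reported because it is not in the set. Without a stated alphabet there would be no basis for that decision.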
Update
You are confusing the term “word” with “token”. A word is not some part of a language or program. A word is a finite string of characters from the alphabet. It has nothing to do with a language construct like “int” or “while”. For example, each C program is a word because it is a finite string of characters from the alphabet. The set of all of these programs (words) forms the C programming language. Tokens like “void” or “int” are an entirely different thing.
To recap, you start by defining the set of characters you want to use. This is called the alphabet. Finite strings of these characters form words. A language is some subset of all possible words. To define a language, you specify which words belong to it, for example with a regular expression or a context-free grammar.
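The recap can be sketched in a few lines of Python. Everything here is a toy assumption chosen for illustration: the two-character alphabet, the regular expression `a*b`, and the names `words_up_to` and `in_language`.

```python
import re
from itertools import product

# Step 1: the alphabet, a set of characters (hypothetical toy alphabet).
ALPHABET = {"a", "b"}

def words_up_to(alphabet, max_len):
    """Yield every word (finite string) over `alphabet` up to max_len characters."""
    for n in range(max_len + 1):
        for chars in product(sorted(alphabet), repeat=n):
            yield "".join(chars)

# Step 2: the language, a subset of all words. Here it is defined by a
# regular expression: zero or more 'a' followed by exactly one 'b'.
LANGUAGE = re.compile(r"a*b")

def in_language(word):
    """A word belongs to the language if it is over ALPHABET and matches LANGUAGE."""
    return set(word) <= ALPHABET and LANGUAGE.fullmatch(word) is not None

all_words = list(words_up_to(ALPHABET, 2))
language_words = [w for w in all_words if in_language(w)]
```

Here `all_words` contains the seven words of length at most two, including the empty word, and `language_words` keeps only `"b"` and `"ab"`. A real programming language works the same way in principle, except that the alphabet is much larger and the "words" are entire programs.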
Wikipedia has a good page on formal languages.
http://en.wikipedia.org/wiki/Formal_language