How is the semantic analysis done by a compiler (generally)?
I had to answer to this question during my last exam, it wasn’t enough for the professor.
I included BNF (with an example) and syntactic cards in my answer, to which he asked me: “What happens when the compiler finds a statement like int i;?”
Time to read Aho&Ullman/Dragon book carefully.
Semantic analysis is the activity of a compiler to determine what the types of various values are, how those types interact in expressions, and whether those interactions are semantically reasonable. For instance, you can’t reasonably multiply a string by class name, although no editor will stop you from writing
To do this, the compiler must first identify declarations and scopes, and typically records the result of this step in a set of symbol tables. This tells it what specific identifiers means in specific contexts. It must also determine the types of various literal constants; “abc” is a different type than 12.2e-5.
Then it must visit all locations where identifiers and literals are used, and verify that the use of the identifier/literal, and the results computed, are compatible with the language definition (as in the above example).
As to how this is done: typically the source code is parsed, some representation of the program is constructed (syntax trees are very popular), and that representation is walked (“visited”) element by element to collect/validate the semantic information. The symbol table is usually just a set of hash tables associated with the syntax tree representing a scope, hashing from identifiers to structures containing type declarations.