I am wondering whether all programming languages reserve keywords. Say If and While are reserved keywords: we should not use them as ordinary variable or function names, so If = 3 is illegal and the compiler will generate an error during the scanner phase. What if a language allowed the programmer to use a reserved keyword, say If, as a variable or function name? How could the compiler handle this? Would it be handled in the scanner or in the parser? What should semantic analysis do?
Update:
I understand this is not good practice, but is the real reason most (or all) programming languages do not support this that the scanner or parser could not scan or parse the language accurately, or is something else really going on behind the scenes? Thanks.
You definitely could do such a thing, but obviously it would destroy the intuitiveness of the source code. Imagine this:
As far as actually implementing it, the lexer wouldn’t have to be changed at all. If the lexer matches “if” in the source, it returns a token with an IF type. Suppose we have the following assignment statement, where if is a variable name and it’s getting assigned the value 1.
The lexer’s token stream to be fed to the parser is:
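As a minimal sketch, assuming the statement is spelled if <- 1; (the <- spelling is a guess based on the LARROW terminal name, and the regular expressions are purely illustrative), a toy lexer in Python could look like this. The point is that it classifies "if" as IF without knowing or caring how the parser will use it.

```python
import re

# Hypothetical token spec. The token names follow the terminals named in the
# answer; the "<-" spelling for LARROW and the patterns are assumptions.
TOKEN_SPEC = [
    ("IF",         r"if\b"),          # keyword: must come before ID
    ("ID",         r"[A-Za-z_]\w*"),
    ("INTLITERAL", r"\d+"),
    ("LARROW",     r"<-"),
    ("SEMICOLON",  r";"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_type, lexeme) pairs, ignoring whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Token types for the assignment statement: IF, LARROW, INTLITERAL, SEMICOLON
print([kind for kind, _ in tokenize("if <- 1;")])
```

Whether "if" is allowed on the left of an assignment is left entirely to the parser.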
I might have the following productions to describe an assignment statement (with integer rvalues):
LARROW, ID, IF, INTLITERAL, and SEMICOLON are terminals, which are tokens returned by the lexer, and assignStmt, id, and intExpr are non-terminals. ID represents an identifier (e.g. class/variable/method name).
After failing the production for an if statement, we’ll eventually enter the first production for an assignment statement. We expand the id non-terminal, whose only production is ID, but the token I want to match is IF, so the assignStmt production fails altogether.
For my language to allow a variable to be named “if” all I have to do is:
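A rough sketch of that change as a hand-written recursive-descent parser in Python, assuming productions of the shape assignStmt : id LARROW intExpr SEMICOLON, intExpr : INTLITERAL, and id : ID (the exact productions are an assumption, as are the function and parameter names). Widening id to ID | IF is modelled here by passing a larger set of accepted terminals.

```python
class ParseError(Exception):
    """Raised when the token stream does not match the production."""


def parse_assign_stmt(tokens, id_terminals=("ID",)):
    """assignStmt : id LARROW intExpr SEMICOLON

    tokens is a list of (token_type, lexeme) pairs as a lexer would emit.
    id_terminals lists the terminals the id non-terminal may expand to.
    """
    pos = 0

    def expect(*kinds):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] not in kinds:
            found = tokens[pos][0] if pos < len(tokens) else "end of input"
            raise ParseError(f"expected {' or '.join(kinds)}, found {found}")
        lexeme = tokens[pos][1]
        pos += 1
        return lexeme

    name = expect(*id_terminals)        # id      : ID   (or ID | IF)
    expect("LARROW")
    value = int(expect("INTLITERAL"))   # intExpr : INTLITERAL
    expect("SEMICOLON")
    return name, value


tokens = [("IF", "if"), ("LARROW", "<-"), ("INTLITERAL", "1"), ("SEMICOLON", ";")]

try:
    parse_assign_stmt(tokens)                    # id : ID
except ParseError as err:
    print("rejected:", err)

print(parse_assign_stmt(tokens, ("ID", "IF")))   # id : ID | IF
```

With the default id : ID the statement is rejected at the first token; with ID | IF it parses. In a full grammar, whether the extra alternative conflicts with the if statement production depends on the rest of the rules, so treat this as an illustration rather than a general recipe.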
Note that | defines an alternate production for the non-terminal. Now we have that second production for the id non-terminal, which matches the current token, and ultimately results in matching an assignment statement. AssignmentStatement is an AST node defined as follows:
Once the parser decides the source is syntactically correct, nothing else should be affected. The names of your variables shouldn’t affect the later stages of compilation, provided you don’t create conditions that would allow that to happen.
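To illustrate that last point, here is a small Python sketch. It assumes a hypothetical AssignmentStatement node with target and value fields (the field names and the execute helper are illustrative assumptions, not the node definition from the answer). Once the name is stored in the AST as a plain string, a later stage treats "if" like any other symbol-table key.

```python
from dataclasses import dataclass


@dataclass
class AssignmentStatement:
    """Hypothetical AST node: the assigned name is just a string."""
    target: str
    value: int


def execute(statements):
    """Toy later stage: walk the AST and fill a symbol table keyed by name."""
    symbols = {}
    for stmt in statements:
        symbols[stmt.target] = stmt.value
    return symbols


program = [AssignmentStatement("if", 1), AssignmentStatement("count", 2)]
print(execute(program))
```

Nothing in execute inspects the spelling of the name, which is why the keyword question is settled by the time parsing finishes.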