We’re writing a compiler for Al Aho’s compilers class, and we’re considering the following code for the generation of our AST. Here is some background. We want to implement scoping rules as a stack of name-id mappings, and we want to push a set of mappings onto the stack before we go in and generate the nodes for the declarations.
compound_statement : {pushScope();} statement_list {popScope();}
So then, here is my question. How does this work? When will this code get executed? Does it get executed when this production is reduced by the parser? Which part happens when? Should I just go to office hours to find out?
Your question talks about building AST nodes, but the body of your explanation is apparently about symbol tables. These ideas are not the same! The AST represents the structure of the program. Symbol tables represent inferences about what names are visible where, and what types they have.
Following your symbol table focus, your notion of pushing the current scope as you “enter” a block, and popping it as you “exit”, is conceptually right, as it abstractly achieves new-scope-per-block.
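YACC does let you put a semantic action in the middle of a rule (a “mid-rule action”), so the rule you wrote is legal. An action at the very end of a rule runs when the rule is recognized (“reduced”). An embedded action is handled as if it were an anonymous empty nonterminal: it runs when the parser reaches that point in the rule, before anything to its right has been parsed. So your pushScope() runs just before the parser starts on statement_list, and your popScope() runs when compound_statement is reduced. One caveat: an embedded action at the very start of a rule can introduce parsing conflicts, because the parser has to commit to that rule before it has seen any of it. You can get the same effect explicitly, and make the timing obvious, by bending the grammar to create opportunities to insert semantic actions. You can do this by rewriting your rules (following your style, so read this as a sketch rather than exact YACC syntax):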
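Something along these lines, as a rough sketch only. The nonterminal names block_start and block_end are made up for illustration, and I am assuming your blocks are delimited by '{' and '}' tokens, which your rule as posted does not show:

```yacc
compound_statement
    : block_start statement_list block_end
    ;

block_start
    : '{'   { pushScope(); }   /* runs as soon as the opening brace is recognized */
    ;

block_end
    : '}'   { popScope(); }    /* runs as soon as the closing brace is recognized */
    ;
```

Each action now sits at the end of its own small rule, so it runs when that small rule is reduced rather than when the whole compound_statement is.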
I added actions to block start and end symmetrically, but you can be a little more, um, parsimonious (grin):
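A sketch of the shorter variant, under the same assumptions (block_start is a hypothetical name, and the brace tokens are my guess at your block delimiters). The pop does not need its own sub-rule, because the end of compound_statement is already a reduction point:

```yacc
compound_statement
    : block_start statement_list '}'   { popScope(); }   /* whole block recognized */
    ;

block_start
    : '{'   { pushScope(); }           /* runs before statement_list is parsed */
    ;
```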
The real secret here was creating a reduction/semantic action execution opportunity after you enter the block, by adding a sub-rule to the original rule. I’ve often done this using an empty rule:
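For example, roughly like this; enter_scope is a hypothetical name, and the brace tokens are again an assumption about your grammar:

```yacc
compound_statement
    : '{' enter_scope statement_list '}'   { popScope(); }
    ;

enter_scope
    : /* empty */   { pushScope(); }   /* matches no input; exists only to run its action */
    ;
```

The empty rule consumes no tokens. The parser reduces it after shifting the opening brace and before it starts on statement_list, which is exactly the point where the new scope should come into existence.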
Having shown roughly how to do this, I don’t think you want to do this at all. What you are doing is tangling semantics with the parsing process. If you go down this path, you’ll find yourself decorating the rest of the grammar with complex actions for creating and looking up identifiers. It is generally better to use the semantic actions to simply build a syntax tree, and then, after parsing is complete, walk the syntax tree to implement your symbol table construction and identifier lookup.
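To make the separation concrete, here is a minimal sketch of that split. Everything in it other than pushScope() and popScope() is hypothetical: the Node type with its kind, nchildren and children fields, the NODE_BLOCK tag, and the make_block() constructor are stand-ins for whatever your AST actually looks like, and I am assuming the semantic value type has been declared so that $$ and $2 are Node pointers.

```yacc
%%
compound_statement
    : '{' statement_list '}'   { $$ = make_block($2); }   /* build a node, nothing else */
    ;
%%
/* Called on the finished tree after yyparse() returns.
   Scopes now follow the shape of the tree, not the order of reductions. */
void resolve_names(Node *n)
{
    if (n == NULL)
        return;
    if (n->kind == NODE_BLOCK)
        pushScope();                      /* entering a block */
    /* declaration and identifier nodes would be handled here */
    for (int i = 0; i < n->nchildren; i++)
        resolve_names(n->children[i]);
    if (n->kind == NODE_BLOCK)
        popScope();                       /* leaving the block */
}
```

The grammar stays free of scoping logic, and the push and pop calls are paired inside one function where it is easy to see that they balance.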
I’d go to office hours and ask as many questions as you can think of, whether you think they are dumb or not. It will pay off handsomely.