I have tried something like this in my Bison file…
ReturnS: RETURN expression { printf(";"); }
…but the semicolon is printed AFTER the next token, past this rule, instead of right after the expression. We're required to convert the input file to a C-like form. The original language doesn't require a semicolon after the expression in a return statement, but C does, so I thought I'd add it to the output manually with printf. The semicolon does get added, but only after the next token (outside the ReturnS rule) has been parsed, rather than as soon as the expression rule returns to ReturnS.
This rule also causes the same result:
loop_for: FOR var_name COLONEQUALS expression TO {printf("%s<=", $<chartype>2);} expression STEP {printf("%s+=", $<chartype>2);} expression {printf(")\n");} Code ENDFOR
Besides the first two printf’s not working right (I’ll post another question regarding that), the last printf is actually called AFTER the first token/literal of the “Code” rule has been parsed, resulting in something like this:
for (i=0; i<=5; i+=1
a)
=a+1;
instead of
for (i=0; i<=5; i+=1)
a=a+1;
Any ideas what I’m doing wrong?
This is probably because the parser has to look ahead one token before it can decide to reduce by the rule you show.
The action is executed when the rule is reduced, and it is very common for the parser to have to read one more token before it knows that it can (or should) reduce the previous rule.
For example, if an expression can consist of an indefinite sequence of added terms, it has to read beyond the last term to know there isn’t another ‘+’ to continue the expression.
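To make the timing concrete, here is a minimal sketch in plain C of a hand-written parser for "term { '+' term }". It is not Bison-generated code, and every name in it (toy_parser, read_token, peek, consume, parse_expression) is hypothetical. It also assumes, purely for illustration, that the scanner echoes each token as soon as it is read; whether the scanner in the question does that is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy parser state; none of these names come from Bison or Flex. */
struct toy_parser {
    const char **tokens;    /* NULL-terminated list of tokens */
    size_t next;            /* index of the next unread token */
    const char *lookahead;  /* token read but not yet consumed, or NULL */
    char out[128];          /* everything "printed" so far */
};

/* Append text to the output buffer, truncating if it would overflow. */
static void emit(struct toy_parser *p, const char *text) {
    strncat(p->out, text, sizeof p->out - strlen(p->out) - 1);
}

/* Reading a token echoes it (assumption: the scanner prints each lexeme). */
static const char *read_token(struct toy_parser *p) {
    const char *tok = p->tokens[p->next];
    if (tok != NULL) {
        p->next++;
        emit(p, tok);
    }
    return tok;
}

/* Look at the next token, reading it from the scanner if necessary. */
static const char *peek(struct toy_parser *p) {
    if (p->lookahead == NULL)
        p->lookahead = read_token(p);
    return p->lookahead;
}

/* Accept the current lookahead token. */
static void consume(struct toy_parser *p) {
    (void)peek(p);
    p->lookahead = NULL;
}

/* expression: term { '+' term }, followed by the end-of-expression action. */
static void parse_expression(struct toy_parser *p, const char *action_text) {
    assert(p != NULL && action_text != NULL);
    consume(p);                                   /* first term */
    while (peek(p) != NULL && strcmp(peek(p), "+") == 0) {
        consume(p);                               /* the '+' */
        consume(p);                               /* the term after it */
    }
    /* Only here is the expression known to be complete, and the token
       that proved it has already been read (and echoed). */
    emit(p, action_text);
}
```

Running parse_expression over the tokens i, +, 1, a with the action text ")\n" leaves "i+1a)\n" in out: the a that belongs to the next statement is echoed before the closing parenthesis, which is the same shape as the output in the question.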
After seeing the Yacc/Bison grammar and Lex/Flex analyzer, some of the problems became obvious, and others took a little more sorting out.
$$ to ensure that rules had the necessary information. Keywords for the most part did not need a value; things like variable names and numbers do.
The prototype solution returned had a major memory leak because it used strdup() liberally and didn't use free() at all. Making sure that the leaks are fixed – possibly by using a char array rather than a char pointer for YYSTYPE – is left to the OP.
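One possible way to plug that kind of leak, sketched here under assumptions rather than taken from the prototype: if the semantic values are heap-allocated strings, a helper can build the combined text for $$ and free the pieces it consumed. The names dup_text and concat_and_free are hypothetical, and dup_text merely stands in for strdup() so the sketch needs only standard C.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a malloc'd copy of s. */
static char *dup_text(const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical helper: returns a newly allocated "a sep b" string and
   takes ownership of a and b, freeing both. The caller owns the result
   and must free() it once it has been printed or merged again.
   Returns NULL if the allocation fails. */
static char *concat_and_free(char *a, const char *sep, char *b) {
    assert(a != NULL && sep != NULL && b != NULL);
    size_t len = strlen(a) + strlen(sep) + strlen(b) + 1;
    char *result = malloc(len);
    if (result != NULL)
        snprintf(result, len, "%s%s%s", a, sep, b);
    free(a);   /* the pieces are no longer needed once merged */
    free(b);
    return result;
}
```

In a grammar action this might be called along the lines of $$ = concat_and_free($1, "<=", $3);, with the top-level rule printing and then freeing the final string. That call shape is an illustration only; the actual rules and YYSTYPE from the prototype are not shown here.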