Wikipedia’s Interpolation Definition
I am just learning flex / bison and I am writing my own shell with it. I am trying to figure out a good way to do variable interpolation. My initial approach was to have flex scan for something like ~ for my home directory, or $myVar, and then set yylval.string to whatever a lookup function returns. My problem is that this doesn’t help me when the expansion and the text next to it should form a single token:
kbsh:/home/kbrandt% echo ~
/home/kbrandt
kbsh:/home/kbrandt% echo ~/foo
/home/kbrandt /foo
kbsh:/home/kbrandt%
The lex definition I have for variables:
\$[a-zA-Z/0-9_]+ {
    /* yytext + 1 skips the leading '$' */
    yylval.string = return_value(&variables, yytext + 1);
    return(WORD);
}
Then in my grammar, I have things like:
chdir_command:
    CD WORD { change_dir($2); }
    ;
Anyone know of a good way to handle this sort of thing? Am I going about this all wrong?
The way ‘traditional’ shells deal with things like variable substitution is difficult to handle with lex/yacc. What they do is more like macro expansion: after expanding a variable, they re-tokenize the input without expanding any further variables. So, for example, given an input like “xx${$foo}” where ‘foo’ is defined as ‘bar’ and ‘bar’ is defined as ‘$y’, the result is ‘xx$y’, which is treated as a single word (and $y is not expanded).
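To make the "expand once, don't expand the result" idea concrete, here is a minimal sketch in plain C, outside of flex entirely. It assumes a hypothetical `lookup_var` function standing in for the shell's real variable table and a hypothetical `expand_word` helper; it handles only a leading `~` and plain `$name` references, not the `${...}` form or quoting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup: stands in for the shell's real variable table. */
static const char *lookup_var(const char *name)
{
    if (strcmp(name, "HOME") == 0) return "/home/kbrandt";
    if (strcmp(name, "foo") == 0)  return "bar";
    if (strcmp(name, "bar") == 0)  return "$y";
    return "";                      /* unset variables expand to nothing */
}

/* Append n bytes of src to a growable, NUL-terminated buffer. */
static void append(char **buf, size_t *len, const char *src, size_t n)
{
    char *tmp = realloc(*buf, *len + n + 1);
    if (tmp == NULL) { perror("realloc"); exit(EXIT_FAILURE); }
    memcpy(tmp + *len, src, n);
    *len += n;
    tmp[*len] = '\0';
    *buf = tmp;
}

/* Expand a leading ~ and every $name in word in one left-to-right pass.
   Substituted text is copied verbatim and never rescanned, so a value
   that itself contains '$' is left alone. The caller frees the result. */
char *expand_word(const char *word)
{
    char *out = NULL;
    size_t len = 0;
    const char *p = word;

    assert(word != NULL);
    append(&out, &len, "", 0);      /* start with a valid empty string */

    if (*p == '~' && (p[1] == '/' || p[1] == '\0')) {
        const char *home = lookup_var("HOME");
        append(&out, &len, home, strlen(home));
        p++;
    }
    while (*p != '\0') {
        if (*p == '$' && (isalpha((unsigned char)p[1]) || p[1] == '_')) {
            const char *start = ++p;            /* first char of the name */
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            size_t n = (size_t)(p - start);
            char *name = malloc(n + 1);
            if (name == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
            memcpy(name, start, n);
            name[n] = '\0';
            const char *val = lookup_var(name);
            append(&out, &len, val, strlen(val));
            free(name);
        } else {
            append(&out, &len, p, 1);           /* ordinary character */
            p++;
        }
    }
    return out;
}
```

One way to use something like this would be to have the lexer return the raw word (for example `~/foo`) as a single WORD token and call the helper from the lexer action or the grammar action, so the expansion never splits the token.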
You can deal with this in flex, but you need a lot of supporting code. You need to use flex’s YY_BUFFER_STATE machinery to sometimes redirect the expanded text into a buffer that you then rescan from, and use start states carefully to control when variables can and can’t be expanded.
It’s probably easier to use a very simple lexer that returns tokens like ALPHA (one or more alphabetic chars), NUMERIC (one or more digits), or WHITESPACE (one or more spaces or tabs), and have the parser assemble them appropriately, and you end up with rules like:
As you can see, this gets complex quite fast.
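The grammar rules referred to above are not reproduced here. As a rough illustration of the idea only (not a reconstruction of those rules), the sketch below does the same thing by hand in C: a very simple lexer hands back small pieces, and a second layer glues adjacent pieces into words. ALPHA, NUMERIC and WHITESPACE are the token kinds named above; DOLLAR, OTHER, `lookup_var` and `split_words` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define WORD_MAX 128

/* DOLLAR and OTHER are assumptions added for this sketch. */
enum piece { END, ALPHA, NUMERIC, WHITESPACE, DOLLAR, OTHER };

/* Very simple lexer: classify the run starting at *p, report where it
   starts and how long it is, and advance *p past it. */
static enum piece next_piece(const char **p, const char **start, size_t *len)
{
    const char *s = *p;
    enum piece kind;

    *start = s;
    if (*s == '\0') { *len = 0; return END; }
    if (isalpha((unsigned char)*s)) {
        kind = ALPHA;
        while (isalpha((unsigned char)*s)) s++;
    } else if (isdigit((unsigned char)*s)) {
        kind = NUMERIC;
        while (isdigit((unsigned char)*s)) s++;
    } else if (*s == ' ' || *s == '\t') {
        kind = WHITESPACE;
        while (*s == ' ' || *s == '\t') s++;
    } else {
        kind = (*s == '$') ? DOLLAR : OTHER;
        s++;
    }
    *len = (size_t)(s - *start);
    *p = s;
    return kind;
}

/* Hypothetical variable table with a single entry. */
static const char *lookup_var(const char *name, size_t n)
{
    if (n == 3 && strncmp(name, "foo", 3) == 0) return "bar";
    return "";
}

/* Append len bytes of text to word, truncating at WORD_MAX - 1. */
static void add_text(char *word, const char *text, size_t len)
{
    size_t used = strlen(word);
    if (len > WORD_MAX - 1 - used) len = WORD_MAX - 1 - used;
    memcpy(word + used, text, len);
    word[used + len] = '\0';
}

/* The "parser": adjacent non-whitespace pieces form one word, and
   DOLLAR followed by ALPHA is replaced by the variable's value.
   Returns the number of words stored. */
size_t split_words(const char *line, char words[][WORD_MAX], size_t max_words)
{
    size_t count = 0, len;
    int in_word = 0;
    const char *p = line, *text;
    enum piece kind;

    assert(line != NULL && words != NULL);
    while ((kind = next_piece(&p, &text, &len)) != END) {
        if (kind == WHITESPACE) { in_word = 0; continue; }
        if (!in_word) {
            if (count == max_words) break;
            words[count++][0] = '\0';           /* start a new word */
            in_word = 1;
        }
        if (kind == DOLLAR) {
            const char *q = p, *name;
            size_t nlen;
            if (next_piece(&q, &name, &nlen) == ALPHA) {
                text = lookup_var(name, nlen);  /* substitute the value */
                len = strlen(text);
                p = q;                          /* consume the name too */
            }                                   /* else keep '$' literally */
        }
        add_text(words[count - 1], text, len);
    }
    return count;
}
```

Even this toy version already has gaps: a name like `$foo1` is split into the variable `foo` plus a literal `1`, and quoting, `~`, and `${...}` are not handled at all, which gives some feel for how quickly the real rules grow.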