I have been looking at flex and bison tutorials online trying to solve my problem, but they all use very simple examples and mine is more complicated. I need to parse a file which may contain input that looks like this:
f(x,g(x))
These functions may also have an arbitrary number of arguments.
The problem is that I need both f and g to be treated as functions by the parser, rather than having f parsed as a function and g(x) passed through as an unparsed parameter. In other words, I need output that looks like this:
[f,x,[g,x]]
and not like:
[f, x, g(x)]
Could someone tell me how to best do this and possibly provide the regex (since I’m not that good with them)?
At the lexical (flex) level, you would recognize four tokens as identifiers: f, x, g, and x. At the syntax (bison) level, you would recognize g(x) and f(x, g(x)) as expressions. Very schematically:
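A minimal sketch of such a grammar, assuming a single token named ID for identifiers (the token and rule names here are illustrative, not taken from any particular tutorial), might look like this:

```yacc
%token ID

%%
expr : ID                 /* a plain name such as x */
     | ID '(' args ')'    /* a call such as g(x) or f(x, g(x)) */
     ;

args : expr               /* one argument */
     | args ',' expr      /* or a longer list followed by one more */
     ;
%%
```

Because each argument is itself an expr, g(x) inside the argument list of f is parsed as a call in its own right rather than as a flat run of tokens. Note that this sketch does not accept calls with zero arguments such as f().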
This little example will just give you the flavor of the difference between recognizing tokens and parsing.
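To get the bracketed output from the question, semantic actions can build a string for each expression. The following is only a sketch under some assumptions: the lexer is assumed to return ID with a heap-allocated copy of the name in yylval.str, and join3 is a hypothetical helper written just for this example.

```yacc
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int yylex(void);
void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }

/* Hypothetical helper: concatenate three strings into a fresh buffer. */
static char *join3(const char *a, const char *b, const char *c) {
    char *out = malloc(strlen(a) + strlen(b) + strlen(c) + 1);
    if (!out) abort();
    strcpy(out, a);
    strcat(out, b);
    strcat(out, c);
    return out;
}
%}

%union { char *str; }
%token <str> ID
%type  <str> expr args

%%
input : expr              { printf("%s\n", $1); free($1); }
      ;

expr  : ID                { $$ = $1; }
      | ID '(' args ')'   { /* builds "[name,arg1,arg2,...]" */
                            char *head = join3("[", $1, ",");
                            $$ = join3(head, $3, "]");
                            free(head); free($1); free($3); }
      ;

args  : expr              { $$ = $1; }
      | args ',' expr     { /* appends ",next" to the list so far */
                            $$ = join3($1, ",", $3);
                            free($1); free($3); }
      ;
%%
```

With a lexer along those lines, the input f(x,g(x)) should reduce g(x) to [g,x] first and then the outer call to [f,x,[g,x]]. A real program would more likely build a tree of nodes than strings, but the shape of the actions would be the same.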
You could also parse arguments as:
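One common variant, offered here as a guess at the kind of alternative meant, is a right-recursive list. The rule names are again illustrative:

```yacc
/* Right-recursive: the first argument, then the rest of the list. */
args : expr
     | expr ',' args
     ;

/* For comparison, the left-recursive form:
 *
 *   args : expr
 *        | args ',' expr
 *        ;
 */
```

Both forms accept the same argument lists. In a bison-generated parser the left-recursive form reduces after each argument, so the parser stack stays shallow, while the right-recursive form shifts the whole list onto the stack before it starts reducing.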
There are some subtle differences between the two, which might or might not be relevant to your problem.
The regular expression to recognize an identifier at the lexical level is whatever you like. Perhaps [A-Za-z][A-Za-z0-9]*, in other words, a letter followed by optional digits and letters.
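In a flex file, a pattern like that sits next to rules for the punctuation. This is a rough sketch, assuming the bison grammar declares an ID token and a %union with a char *str member, and that the generated header is called parser.tab.h (all of these names are assumptions):

```lex
%{
#include <stdio.h>
#include <string.h>
#include "parser.tab.h"   /* assumed name of the bison-generated header */
%}

%option noyywrap

%%
[A-Za-z][A-Za-z0-9]*   { yylval.str = strdup(yytext); return ID; }
[(),]                  { return yytext[0]; /* punctuation as itself */ }
[ \t\r\n]+             { /* skip whitespace */ }
.                      { fprintf(stderr, "unexpected character: %s\n", yytext); }
%%
```

The lexer never needs to know whether a name is a function or a parameter. It only hands back identifiers and punctuation, and the grammar decides which is which from what follows.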
A good book to start with would be John Levine’s lex & yacc. I have not used his flex & bison, but I would recommend it on the strength of the earlier book.