I have a string:
input = "maybe (this is | that was) some ((nice | ugly) (day |night) | (strange (weather | time)))"
What is the best way to parse this string in Ruby?
I mean, the script should be able to build sentences like these:
maybe this is some ugly night
maybe that was some nice night
maybe that was some strange time
And so on, you get the point…
Should I read the string character by character and build a state machine with a stack to store the parenthesized values for later evaluation, or is there a better approach?
Or is there maybe a ready-made, out-of-the-box library for this purpose?
Try Treetop. It is a Ruby-like DSL to describe grammars. Parsing the string you’ve given should be quite easy, and by using a real parser you’ll easily be able to extend your grammar later.
An example grammar for the type of string that you want to parse (save as sentences.treetop):
The grammar above needs an accompanying file that defines the classes that allow us to access the node values (save as sentence_nodes.rb).
The following example program shows that it is quite simple to parse the example sentence that you have given.
The output of this program is:
You can also access the syntax tree:
The output is here.
There you have it: a scalable parsing solution that should come quite close to what you want to do in about 50 lines of code. Does that help?
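For comparison, the hand-written approach mentioned in the question can be sketched in plain Ruby without any gem. This is not the Treetop solution described above, only a minimal recursive-descent sketch under a few assumptions: words are separated by whitespace, `(`, `)` and `|` are the only special characters, and all method names (`tokenize`, `parse`, `expand`) are made up for illustration. The recursion plays the role of the explicit stack.

```ruby
# Plain-Ruby sketch (hypothetical helper names, no gems required).

# Split the input into "(", ")", "|" and word tokens.
def tokenize(input)
  input.scan(/[()|]|[^\s()|]+/)
end

# alternation := sequence ("|" sequence)*
# Returns [:alt, sequence, sequence, ...].
def parse_alternation(tokens)
  branches = [parse_sequence(tokens)]
  while tokens.first == "|"
    tokens.shift
    branches << parse_sequence(tokens)
  end
  [:alt, *branches]
end

# sequence := (word | "(" alternation ")")*
# Returns [:seq, item, item, ...] where an item is a String or an :alt node.
def parse_sequence(tokens)
  items = []
  until tokens.empty? || ["|", ")"].include?(tokens.first)
    token = tokens.shift
    if token == "("
      items << parse_alternation(tokens)
      raise ArgumentError, "missing closing parenthesis" unless tokens.shift == ")"
    else
      items << token
    end
  end
  [:seq, *items]
end

# Parse a whole string into a nested-array syntax tree.
def parse(input)
  tokens = tokenize(input)
  tree = parse_alternation(tokens)
  raise ArgumentError, "unexpected #{tokens.first.inspect}" unless tokens.empty?
  tree
end

# Expand a tree into every sentence it can produce.
def expand(node)
  return [node] if node.is_a?(String)

  kind, *children = node
  case kind
  when :alt
    # An alternation yields the sentences of each branch, one after another.
    children.flat_map { |child| expand(child) }
  when :seq
    # A sequence yields every combination of its items' expansions.
    children.map { |child| expand(child) }.reduce([""]) do |acc, options|
      acc.product(options).map { |a, b| [a, b].reject(&:empty?).join(" ") }
    end
  end
end

input = "maybe (this is | that was) some ((nice | ugly) (day |night) | (strange (weather | time)))"
sentences = expand(parse(input))
puts sentences
```

For the example string this produces 12 sentences, among them "maybe this is some ugly night" and "maybe that was some nice night". The tree returned by `parse` is just nested arrays, so it can be inspected with `p parse(input)`; a real parser generator such as Treetop remains the better choice once the grammar grows.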