I’m trying to write a parser using flex and bison, but I’m confused about how it works. I’m trying to parse a text file formatted in the following way:
Version Header Version 1.00 <--- File always starts with a header
Key : Value <--- Each section includes these but these after the version header are in the "Root" section
==Section Name <--- A section
$Key : Value <--- These are properties
Key : Value <--- Same thing as in the "Root" section
Sample Format:
NasuTek Licensing Version 1.00
Vendor : NASUTEKENTERPRISES
Notice : NasuTek Enterprises
License Group : NasuTek Asheila
License Name : NasuTek Asheila
Vendor Notice : NasuTek Asheila Internal Build License
Serial : ASHEL-87267-4987-3737-37821:32742
Start Date : Wed July 04 00:00:00 2012
End Date : Sat July 20 00:00:00 2013
Trial : Yes
Count : 0
Components : EXPORT
Host : Any
==Software Configuration
$Signed Section : Yes
Export Configuration : {
Supports Export to XML : Yes
Supports Export to Text : Yes
}
==Signature
vpUsQJ+Qo4OS+RQg0vuLW0mXjAj/o6v[truncated]
How can I accomplish this? I’m confused about grouping. I can get it to see the key/value pairs, as that’s simple enough, but I don’t know how to deal with splitting on == and with the {} pairs.
Okay, your grammar isn’t all that simple. What I did was define a token in the lexer that treats \n== as the section start symbol (which I called EQEQ). So, the grammar rule looked like:
And the tokenizing rule looked like:
I used a start condition in order to be able to treat the word Signature like a keyword if it comes right after the EQEQ, and another start condition so that a signature section would just pull in the signature data as a single text blob:
The grouping is easiest to define in a single rule. This is the grammar I used for a property key-value pair:
And then this is the rule I used to define a value_block:
And a sub_property looks just like a section_property.
Whenever a new section is encountered, your parsing code should remember which section the subsequent value pairs belong to. Likewise, when parsing a sub-property block, the enclosing property key should be saved so that the sub-properties can be appropriately assigned.
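One way to keep that context is a small piece of state that the grammar actions update. The following is only a rough sketch in plain C, not the code from my parser: struct parse_ctx and the ctx_* functions are hypothetical names, the "Section/Key" output format is made up for illustration, and it assumes that pairs seen before the first == belong to a section called "Root", as described in the question.

```c
#include <stdio.h>
#include <string.h>

#define CTX_NAME_LEN 128

/* Hypothetical parser context; grammar actions would update it. */
struct parse_ctx {
    char section[CTX_NAME_LEN];   /* current section, "Root" before any == */
    char block_key[CTX_NAME_LEN]; /* enclosing property key inside { }, else "" */
};

/* Bounded copy; names longer than the buffer are truncated in this sketch. */
static void copy_name(char *dst, const char *src)
{
    snprintf(dst, CTX_NAME_LEN, "%s", src);
}

void ctx_init(struct parse_ctx *ctx)
{
    copy_name(ctx->section, "Root");
    ctx->block_key[0] = '\0';
}

/* Call when a section start (EQEQ plus a name) has been recognized. */
void ctx_enter_section(struct parse_ctx *ctx, const char *name)
{
    copy_name(ctx->section, name);
    ctx->block_key[0] = '\0';
}

/* Call when "Key : {" opens a sub-property block. */
void ctx_open_block(struct parse_ctx *ctx, const char *key)
{
    copy_name(ctx->block_key, key);
}

/* Call when the closing "}" is seen. */
void ctx_close_block(struct parse_ctx *ctx)
{
    ctx->block_key[0] = '\0';
}

/* Build "Section/Key" or "Section/BlockKey/Key" into out. */
void ctx_qualified_key(const struct parse_ctx *ctx, const char *key,
                       char *out, size_t out_len)
{
    if (ctx->block_key[0] != '\0')
        snprintf(out, out_len, "%s/%s/%s", ctx->section, ctx->block_key, key);
    else
        snprintf(out, out_len, "%s/%s", ctx->section, key);
}
```

With something like this, the action for a plain key-value pair only needs to call ctx_qualified_key to find out where the pair belongs; it does not have to know whether it is inside a section or a { } block.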
One thing that could trip you up in yacc-like parsers is their bottom-up nature. As the leaf elements of a rule are recognized, save the values in the leaf rules, and in your enclosing rule, refer to the saved values. For example, this rule:
saves consecutive words into a save buffer representing the word sequence. Then, in an enclosing rule, that save buffer is saved again:
Where words_save_as_key basically dups the saved words buffer, and then resets that buffer for a different sequence that will be saved (likely, the sequence representing the associated value).
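To make the save-and-reset idea concrete, here is a minimal C sketch of such helpers. Only the name words_save_as_key comes from the description above; its signature here is an assumption, and words_append, words_save_as_value, the fixed-size buffer, and the single-space separator are all hypothetical choices made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define WORDS_MAX 256

/* Shared save buffer that the leaf (word sequence) rule appends to. */
static char words_buf[WORDS_MAX];

/* Append one word, separating consecutive words with a single space.
   A word that would overflow the buffer is dropped in this sketch. */
void words_append(const char *word)
{
    size_t used = strlen(words_buf);
    size_t need = strlen(word) + (used ? 1 : 0);
    if (used + need >= WORDS_MAX)
        return;
    if (used)
        words_buf[used++] = ' ';
    strcpy(words_buf + used, word);
}

/* Duplicate the buffer, then reset it for the next word sequence.
   The caller owns the returned copy and must free() it. */
static char *words_take(void)
{
    size_t len = strlen(words_buf);
    char *copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, words_buf, len + 1);
    words_buf[0] = '\0';
    return copy;
}

/* Called from the enclosing rule once the key words are complete. */
char *words_save_as_key(void)
{
    return words_take();
}

/* Hypothetical counterpart for the value side of the pair. */
char *words_save_as_value(void)
{
    return words_take();
}
```

The leaf rule's action would call words_append for each word it recognizes, and the enclosing key-value rule would call words_save_as_key after the key and words_save_as_value after the value, so the same buffer can be reused for both sequences.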