I have the following test code for my PyParsing grammar:
from pyparsing import Word, nums, alphas, delimitedList, Group, oneOf
from pprint import pprint
field = Word(alphas)("field")
operator = oneOf("= +")("operator")
string_value = Word(alphas)("string")
int_value = Word(nums).setParseAction(lambda t: int(t[0]))("int")
value = (string_value | int_value )("value")
expression = Group(field + operator + value)("expression")
grammar = Group(delimitedList(expression, delim="&&"))("expr_list")
def test(s):
    print("Parsing '{0}'".format(s))
    tokenized = grammar.parseString(s)
    for f in tokenized:
        e = f.expression
        pprint(dict(e.items()))

if __name__ == "__main__":
    test("foo=1")
    test("foo=1 && bar=2")
    test("foobar=2 && snakes=4")
The output is quite unexpected; it seems that I only get the last expression in tokenized:
Parsing 'foo=1'
{'field': 'foo', 'int': 1, 'operator': '=', 'value': 1}
Parsing 'foo=1 && bar=2'
{'field': 'bar', 'int': 2, 'operator': '=', 'value': 2}
Parsing 'foobar=2 && snakes=4'
{'field': 'snakes', 'int': 4, 'operator': '=', 'value': 4}
How do I fix this?
Untested, but I think you just need to change:
to:
EDIT: okay, one other change. Your iteration code looks for multiple items named ‘expression’. There are multiple items named ‘expression’ inside the ‘&&’-delimited list. It is simpler not to reference these by their name, but by iterating over the grouped expressions inside ‘expr_list’:
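A minimal sketch of that iteration, assuming the grammar from the question is unchanged. The helper name parse_expressions is hypothetical and only there to keep the example self-contained; it is not part of pyparsing.

```python
from pyparsing import Word, nums, alphas, delimitedList, Group, oneOf

# Grammar copied from the question.
field = Word(alphas)("field")
operator = oneOf("= +")("operator")
string_value = Word(alphas)("string")
int_value = Word(nums).setParseAction(lambda t: int(t[0]))("int")
value = (string_value | int_value)("value")
expression = Group(field + operator + value)("expression")
grammar = Group(delimitedList(expression, delim="&&"))("expr_list")


def parse_expressions(s):
    """Hypothetical helper: return one dict per '&&'-separated expression."""
    tokenized = grammar.parseString(s)
    # Walk the groups inside 'expr_list' by position instead of looking
    # them up through the shared 'expression' results name.
    return [dict(group.items()) for group in tokenized["expr_list"]]


results = parse_expressions("foo=1 && bar=2")
for r in results:
    print(r)
```

Each element of tokenized["expr_list"] is its own grouped ParseResults, so the named fields of every expression stay available.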
I usually use the dump method on parsed results to see just how the data has been grouped and named. If I print out tokenized.dump() I get:
I can see that I can get at the ‘expr_list’ named value. I also see that there is a sub-level ‘expression’, but as these keys are by default unique, like in a dict, there is only a value for the group that was parsed last. But I can access the multiple groups inside ‘expr_list’ – if I look at the 0’th item (using print tokenized['expr_list'][0].dump()), I get:
So I can iterate over the groups in the ‘expr_list’ using:
and I’ll get:
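To reproduce the inspection steps described above, something like the following sketch should work, assuming the grammar from the question. The variable names full_dump and first_dump are only illustrative, and the exact dump layout may differ between pyparsing versions, so no output is shown here.

```python
from pyparsing import Word, nums, alphas, delimitedList, Group, oneOf

# Grammar copied from the question.
field = Word(alphas)("field")
operator = oneOf("= +")("operator")
string_value = Word(alphas)("string")
int_value = Word(nums).setParseAction(lambda t: int(t[0]))("int")
value = (string_value | int_value)("value")
expression = Group(field + operator + value)("expression")
grammar = Group(delimitedList(expression, delim="&&"))("expr_list")

tokenized = grammar.parseString("foo=1 && bar=2")

# dump() returns a string showing the token list plus any named results.
full_dump = tokenized.dump()
print(full_dump)

# Dump only the first grouped expression inside 'expr_list'.
first_dump = tokenized["expr_list"][0].dump()
print(first_dump)
```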
It isn’t necessary to put results names on every level within your grammar – in this case, we got the expressions by iterating through expr_list and didn’t even use expression. And in fact, if you take the Group off the outermost grammar expression, you don’t need ‘expr_list’ either; just iterate with for f in tokenized:.
When trying to tease out the contents of your returned ParseResults, the dump method is probably the best tool.
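As a rough sketch of the simplification mentioned above, this variant drops the outer Group and the ‘expr_list’ and ‘expression’ names, keeping only the names on the innermost pieces. It assumes the rest of the grammar is as in the question; the name fields_seen is illustrative.

```python
from pyparsing import Word, nums, alphas, delimitedList, Group, oneOf

field = Word(alphas)("field")
operator = oneOf("= +")("operator")
string_value = Word(alphas)("string")
int_value = Word(nums).setParseAction(lambda t: int(t[0]))("int")
value = (string_value | int_value)("value")

# Each expression is still grouped, but carries no results name.
expression = Group(field + operator + value)

# No outer Group and no 'expr_list' name on the top-level grammar.
grammar = delimitedList(expression, delim="&&")

tokenized = grammar.parseString("foobar=2 && snakes=4")

fields_seen = []
for f in tokenized:
    # Every f is one grouped expression with its own named fields.
    fields_seen.append((f["field"], f["operator"], f["value"]))
    print(dict(f.items()))
```

Because the top-level result now holds the grouped expressions directly, a plain loop over tokenized visits each one in order.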