Yesterday I posted a similar question to this one:
Python Regex Named Groups.
That approach works pretty well for simple things.
After some research I read about the pyparsing library, which seems to be a good fit for my task.
from pyparsing import (Group, Literal, OneOrMore, Optional, Suppress,
                       White, Word, ZeroOrMore, alphas)

text = '[@a eee, fff fff, ggg @b eee, fff, ggg @c eee eee, fff fff,ggg ggg@d]'
command_s = Suppress(Optional('[') + Literal('@'))
command_e = Suppress(Literal('@') | Literal(']'))
task = Word(alphas)
arguments = ZeroOrMore(
    Word(alphas) +
    Suppress(
        Optional(Literal(',') + White()) | Optional(White() + Literal('@'))
    )
)
command = Group(OneOrMore(command_s + task + arguments + command_e))
print(command.parseString(text))
# which outputs only the first @a sequence
# [['a', 'eee', 'fff', 'fff', 'ggg']]
# the structure should be something like:
[
['a', 'eee', 'fff fff', 'ggg'],
['b', 'eee', 'fff', 'ggg'],
['c', 'eee eee', 'fff fff', 'ggg ggg'],
['d']
]
@ indicates the start of a sequence; the first word is a task (a), followed by optional comma-separated arguments (eee, fff fff, ggg). The problem is that @b, @c and @d are ignored by the above code. Also, “fff fff” is treated as two separate arguments when it should be only one.
See the embedded comments.
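A minimal sketch of one grammar that handles this input, assuming pyparsing is installed. The names `word`, `arg`, `command` and `commands` are illustrative choices, not necessarily the ones used in the original answer. The main ideas are to let pyparsing skip whitespace on its own, to treat an argument as a run of words rejoined with a space, and to group each `@` command separately:

```python
from pyparsing import (Group, OneOrMore, Optional, Suppress, Word,
                       alphas, delimitedList)

text = '[@a eee, fff fff, ggg @b eee, fff, ggg @c eee eee, fff fff,ggg ggg@d]'

# Punctuation is needed for parsing but not wanted in the results.
LBRACK, RBRACK, AT = map(Suppress, '[]@')

word = Word(alphas)

# An argument is one or more words. pyparsing skips the whitespace between
# them, so the parse action rejoins the words with a single space.
arg = OneOrMore(word).setParseAction(lambda tokens: ' '.join(tokens))

# A command is '@', a task name, and an optional comma-separated argument
# list. Group keeps each command in its own sublist.
command = Group(AT + word + Optional(delimitedList(arg)))

# The whole input is a bracketed run of commands. No explicit end marker is
# needed: a command simply stops at the next '@' or at ']'.
commands = LBRACK + OneOrMore(command) + RBRACK

result = commands.parseString(text).asList()
print(result)
```

Compared with the code in the question, there is no `White()` and no "end of command" expression. Consuming the next `@` as a terminator is what kept the following command from being recognized.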
This will print your desired output.
For extra credit, here is how to have pyparsing define keys for you:
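A hedged sketch of what that could look like, using pyparsing's results names. The key names `task` and `args` are assumptions made for illustration, as is wrapping the argument list in its own `Group` so that `args` is always list-like, even for a command with no arguments:

```python
from pyparsing import (Group, OneOrMore, Optional, Suppress, Word,
                       alphas, delimitedList)

text = '[@a eee, fff fff, ggg @b eee, fff, ggg @c eee eee, fff fff,ggg ggg@d]'

LBRACK, RBRACK, AT = map(Suppress, '[]@')
word = Word(alphas)
arg = OneOrMore(word).setParseAction(lambda tokens: ' '.join(tokens))

# Calling an expression with a string attaches a results name to it, so the
# matched tokens can be looked up by key as well as by position.
command = Group(
    AT
    + word('task')
    + Group(Optional(delimitedList(arg)))('args')
)
commands = LBRACK + OneOrMore(command) + RBRACK

# Collect (task, args) pairs by name rather than by index.
named = [(cmd.task, list(cmd.args)) for cmd in commands.parseString(text)]
for task_name, args in named:
    print(task_name, args)
```

With names attached, later code can refer to `cmd.task` and `cmd.args` instead of relying on positions within each sublist.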
Prints