I have a Python script that analyzes a set of error messages and checks whether each message matches a certain pattern (a regular expression) in order to group the messages. For example, ‘file x does not exist’ and ‘file y does not exist’ would both match ‘file .* does not exist’ and be counted as two occurrences of the ‘file not found’ category.
As the number of patterns and categories is growing, I’d like to put these ‘regular expression/display string’ pairs in a configuration file, basically a dictionary serialization of some sort.
I would like this file to be editable by hand, so I’m ruling out any form of binary serialization, and I’d also rather not resort to XML serialization, to avoid problems with characters that need escaping (&, <, > and so on…).
Do you have any idea of what could be a good way of accomplishing this?
Update: thanks to Daren Thomas and Federico Ramponi, but I cannot have an external Python file with possibly arbitrary code.
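You have two decent options: the standard Python configuration file format, read with ConfigParser, or YAML, read with a library such as PyYAML.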
The standard Python configuration files look like INI files with [sections] and key : value or key = value pairs. The advantages to this format are:

YAML is different in that it is designed to be a human-friendly data serialization format rather than specifically designed for configuration. It is very readable and gives you a couple of different ways to represent the same data. For your problem, you could create a YAML file that looks like this:
Or like this:
Using PyYAML couldn’t be simpler:
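A minimal sketch of loading such data, assuming PyYAML is installed. The YAML text is inlined as strings here rather than read from a file, and the second pattern and the categorize helper are illustrative, not taken from the original script. It uses yaml.safe_load, which only builds plain Python objects and so fits the requirement of not running arbitrary code:

```python
import re

import yaml  # PyYAML, a third-party package (pip install pyyaml)

# Block style: one "pattern: category" pair per line.
# Single quotes keep regex characters such as * and \ literal in YAML.
BLOCK_STYLE = r"""
'file .* does not exist': file not found
'user .* not authorized': access denied
"""

# Flow style: the same mapping written inline.
FLOW_STYLE = r"""
{'file .* does not exist': file not found, 'user .* not authorized': access denied}
"""

# safe_load only constructs basic types (dict, list, str, ...).
errors = yaml.safe_load(BLOCK_STYLE)


def categorize(message, patterns):
    """Return the category of the first pattern matching message, else None."""
    for pattern, category in patterns.items():
        if re.search(pattern, message):
            return category
    return None
```

With a real file, the string would be replaced by an open file object, for example yaml.safe_load(f) inside a with open(...) block.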
At this point errors is a Python dictionary with the expected format. YAML is capable of representing more than dictionaries: if you prefer a list of pairs, use this format:

Or

Which will produce a list of lists when yaml.load is called.

One advantage of YAML is that you could use it to export your existing, hard-coded data out to a file to create the initial version, rather than cut/paste plus a bunch of find/replace to get the data into the right format.
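The export idea could be sketched as follows, assuming PyYAML is installed. HARD_CODED is a hypothetical stand-in for whatever table the script currently contains, and the output is kept in a string here; writing it to a file only needs an open file passed as the second argument to yaml.safe_dump.

```python
import yaml  # PyYAML, a third-party package (pip install pyyaml)

# Hypothetical hard-coded table standing in for the script's existing data.
HARD_CODED = {
    "file .* does not exist": "file not found",
    "user .* not authorized": "access denied",
}

# default_flow_style=False asks for block style, one pair per line,
# which is the easier form to edit by hand.
dict_text = yaml.safe_dump(HARD_CODED, default_flow_style=False)

# The same data as a list of [pattern, category] pairs, which keeps
# an explicit order and allows the same category to appear twice.
pairs = [[pattern, category] for pattern, category in HARD_CODED.items()]
pairs_text = yaml.safe_dump(pairs, default_flow_style=False)

# Loading the generated text gives back the original structures.
reloaded_dict = yaml.safe_load(dict_text)
reloaded_pairs = yaml.safe_load(pairs_text)
```

The list-of-pairs form loads as a list of lists, so the script would iterate over it directly instead of calling .items().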
The YAML format will take a little more time to get familiar with, but using PyYAML is even simpler than using ConfigParser, with the advantage that you have more options for how your data is represented.
Either one sounds like it will fit your current needs: ConfigParser will be easier to start with, while YAML gives you more flexibility in the future if your needs expand.
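For the ConfigParser route, a rough sketch under a few assumptions: it uses the Python 3 module name configparser (the module was called ConfigParser in Python 2), the section name patterns and the second entry are made up for illustration, and the category is used as the key with the regular expression as the value, because characters such as : and = inside a key would be read as delimiters.

```python
import configparser
import re

# Inlined here for the sketch; normally this text would live in an .ini file
# and be loaded with parser.read("some_file.ini").
CONFIG_TEXT = r"""
[patterns]
file not found = file .* does not exist
access denied = user .* not authorized
"""

# interpolation=None stops '%' in a regex from being treated as a substitution.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)

# Compile each pattern once; option names are lower-cased by default.
compiled = [
    (re.compile(pattern), category)
    for category, pattern in parser["patterns"].items()
]


def categorize(message):
    """Return the category of the first pattern matching message, else None."""
    for regex, category in compiled:
        if regex.search(message):
            return category
    return None
```

One limitation of this layout is that each category can carry only one pattern, since option names within a section must be unique.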
Best of luck!