I have a special use case which I do not yet know how to cover. I want to dissect a string based on field_name/field_length. For that I define a regex like this:
'(?P<%s>.{%d})' % (field_name, field_length)
And this is repeated for all fields.
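A minimal sketch of how the combined pattern might be assembled. The `fields` list here is an assumed layout of (field_name, field_length) pairs, and `compiled` stands in for `self.compiled`:

```python
import re

# Assumed layout for illustration: (field_name, field_length) pairs.
fields = [('lang', 9), ('desc', 19), ('license', 12)]

# Build one fixed-width named group per field and concatenate them.
pattern = ''.join('(?P<%s>.{%d})' % (name, length) for name, length in fields)
compiled = re.compile(pattern)

# Hypothetical record, padded to the widths above.
record = 'Python'.ljust(9) + 'General purpose'.ljust(19) + 'PSF'.ljust(12)
match = compiled.search(record)
```

Each named group captures exactly its field's width, so the captured values still carry their trailing padding.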
I also have a regex to remove the trailing spaces from each field:
self.re_remove_spaces = re.compile(' *$')
This way I can get each field like this:
def dissect(self, str):
    data = {}
    m = self.compiled.search(str)
    for field_name in self.fields:
        # Match objects expose named groups through group(name)
        value = m.group(field_name)
        # Strip the trailing spaces with the precompiled pattern
        value = self.re_remove_spaces.sub('', value)
        data[field_name] = value
    return data
I have to perform this processing for millions of strings, so it must be efficient.
What annoys me is that I would prefer to perform the dissection + space removal in a single step, using compiled.sub instead of compiled.search, but I do not know how to do this.
Specifically, my question is:
How do I perform regex substitution combining it with named groups in Python regexes?
I take it the fields sit side by side in the string, like columns in a table, e.g.:
So, assuming you know the length of each field in advance, you can do it much more simply, without using a regex at all. (By the way, str is not a good name for a variable, since it shadows the built-in str type.)
Then, if fields = [('lang', 9), ('desc', 19), ('license', 12)]:
Is this what you're trying to do, though?
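A minimal sketch of the regex-free approach, assuming the fields are contiguous and fixed-width. The function signature and the sample record are illustrative assumptions, not part of the original code:

```python
def dissect(line, fields):
    """Split a fixed-width line into a dict of right-stripped field values."""
    data = {}
    start = 0
    for name, length in fields:
        # Slice out this field's column range and drop the trailing spaces.
        data[name] = line[start:start + length].rstrip(' ')
        start += length
    return data


fields = [('lang', 9), ('desc', 19), ('license', 12)]

# Hypothetical sample record, padded to the widths above.
line = 'Python'.ljust(9) + 'General purpose'.ljust(19) + 'PSF'.ljust(12)

print(dissect(line, fields))
# {'lang': 'Python', 'desc': 'General purpose', 'license': 'PSF'}
```

Slicing plus `rstrip` avoids both the pattern match and the per-field substitution, which is likely to matter when processing millions of lines. Measuring on real data is the way to confirm that.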