I’m new to Python, which I’m using to do an ugly little put-this-tabular-data-into-a-db conversion. The program looks at the data, creates a table in MySQL, and then reads the data into the table. In this section, header row text is checked to make some decision about data typing. I had an idea that I could be clever and do this with a single regex rather than if/elifs. My solution works for this case at least, where I don’t have to worry about multiple matches. What I’m asking is, is there any real merit to this approach in terms of efficiency?
def _typeMe(self, header_txt):
    # data typing
    colspecs = {
        'id':'SMALLINT(10)',
        'date':'DATE',
        'comments':'TEXT(4000)',
        'flag':'BIT(1)',
        'def':'VARCHAR(255)'
    }
    # regex to match on header text e.g. 'Provisioner ID'
    r = re.search(re.compile('(ID$)|(Date)|(Comments$)|(FLAG$)', re.IGNORECASE), header_txt)
    checktype = lambda m: max(m.groups()).lower() if m else 'def'
    return colspecs[checktype(r)]
I agree with @ecatmur’s answer; I just wanted to post some slight code suggestions that are a little too long for a comment.
There’s no need to do re.search(re.compile('...', re.IGNORECASE), header_txt). Instead, you can just pass the string straight in as re.search('...', header_txt, re.IGNORECASE). If you’re using the same regex over and over, re.compile is faster, but re.search and friends will call it for you if you didn’t.
Though I don’t share Colin’s disdain for named lambdas (it can be handy just because they’re still one line instead of two), you don’t need an inner function here at all.
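As a rough sketch of those two suggestions together, the flags can go straight to re.search and the conditional expression can sit in the function body instead of a lambda. The names type_me and COLSPECS are hypothetical stand-ins for the question's _typeMe method and colspecs dict. One assumption beyond the original: max(m.groups()) compares strings with None, which raises TypeError on Python 3, so this version picks the matched group with m.lastindex instead.

```python
import re

# Hypothetical module-level copy of the question's colspecs dict;
# 'def' is the fallback type for unrecognised headers.
COLSPECS = {
    'id': 'SMALLINT(10)',
    'date': 'DATE',
    'comments': 'TEXT(4000)',
    'flag': 'BIT(1)',
    'def': 'VARCHAR(255)',
}


def type_me(header_txt):
    # Pattern and flags are passed directly; no explicit re.compile call.
    m = re.search('(ID$)|(Date)|(Comments$)|(FLAG$)', header_txt, re.IGNORECASE)
    # Conditional expression inlined instead of a named lambda.
    # Only one alternative can match, so m.lastindex is that group's number.
    key = m.group(m.lastindex).lower() if m else 'def'
    return COLSPECS[key]
```

Calling type_me('Provisioner ID') looks up the 'id' entry, and a header that matches none of the alternatives falls back to 'def'.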
The max(m.groups()) trick also isn’t necessary if you just make one capturing group instead of four: '(ID|Date|Comments|Flag)$'. Then you can do m.group(1).
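A minimal sketch of the single-group version, again using the hypothetical names type_me and COLSPECS in place of the question's method and dict. Note that this pattern anchors every keyword at the end of the header, whereas the original regex let Date match anywhere, so headers such as 'Date of birth' would be typed differently.

```python
import re

# Hypothetical module-level copy of the question's colspecs dict.
COLSPECS = {
    'id': 'SMALLINT(10)',
    'date': 'DATE',
    'comments': 'TEXT(4000)',
    'flag': 'BIT(1)',
    'def': 'VARCHAR(255)',
}


def type_me(header_txt):
    # One capturing group, so the matched keyword is always group 1.
    m = re.search('(ID|Date|Comments|Flag)$', header_txt, re.IGNORECASE)
    return COLSPECS[m.group(1).lower() if m else 'def']
```

With a single group there is nothing to pick between, which removes the need for max() or any other selection step.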