I’m new to Python, which I’m using to do an ugly little put-this-tabular-data-into-a-db conversion.

Question

Asked: June 8, 2026 (2026-06-08T18:45:13+00:00)

I’m new to Python, which I’m using to do an ugly little put-this-tabular-data-into-a-db conversion. The program looks at the data, creates a table in MySQL, and then reads the data into the table. In this section, header row text is checked to make some decision about data typing. I had an idea that I could be clever and do this with a single regex rather than if/elifs. My solution works for this case at least, where I don’t have to worry about multiple matches. What I’m asking is, is there any real merit to this approach in terms of efficiency?

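import re

def _typeMe(self, header_txt):
    # Map a header keyword to a MySQL column type; 'def' is the fallback.
    colspecs = {
        'id': 'SMALLINT(10)',
        'date': 'DATE',
        'comments': 'TEXT(4000)',
        'flag': 'BIT(1)',
        'def': 'VARCHAR(255)'
    }
    # regex to match on header text e.g. 'Provisioner ID'
    r = re.search(re.compile('(ID$)|(Date)|(Comments$)|(FLAG$)', re.IGNORECASE), header_txt)
    # Pick the one group that matched. Note: max() over a mix of None and str
    # only works on Python 2; Python 3 raises TypeError here.
    checktype = lambda m: max(m.groups()).lower() if m else 'def'
    return colspecs[checktype(r)]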

1 Answer

Editorial Team · Answer 1 · 2026-06-08T18:45:15+00:00

I agree with @ecatmur’s answer; I just wanted to post some slight code suggestions that are a little too long for a comment.

There’s no need to write re.search(re.compile('...', re.IGNORECASE), header_txt). You can pass the pattern string straight in: re.search('...', header_txt, re.IGNORECASE). If you reuse the same regex many times, compiling it once with re.compile and calling the compiled pattern’s search method is slightly faster, but re.search and friends will compile (and cache) the pattern for you if you haven’t.
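To make the two spellings concrete, here is a minimal sketch using the pattern from the question. The names PATTERN, HEADER_RE, search_inline and search_precompiled are made up for illustration; both functions should return equivalent matches.

```python
import re

# Pattern taken from the question; the names below are hypothetical.
PATTERN = '(ID$)|(Date)|(Comments$)|(FLAG$)'

# Option 1: compile once and reuse the compiled pattern object.
HEADER_RE = re.compile(PATTERN, re.IGNORECASE)

def search_precompiled(header_txt):
    return HEADER_RE.search(header_txt)

# Option 2: pass the pattern string and flags straight to re.search,
# which compiles (and caches) the pattern internally.
def search_inline(header_txt):
    return re.search(PATTERN, header_txt, re.IGNORECASE)

for header in ('Provisioner ID', 'Start Date', 'Notes'):
    a = search_inline(header)
    b = search_precompiled(header)
    print(header, a.group(0) if a else None, b.group(0) if b else None)
```

For a handful of header rows the difference is unlikely to be measurable, so readability is probably the better reason to choose one over the other.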

Though I don’t share Colin’s disdain for named lambdas (they can be handy simply because they take one line instead of two), you don’t need an inner function here at all:

return colspecs[max(m.groups()).lower() if m else 'def']
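One caveat worth sketching: max(m.groups()) compares None against strings, which Python 2 tolerates but Python 3 rejects with a TypeError. Assuming you want to keep the four separate groups, m.lastindex (the index of the last group that matched) gives the same answer on either version. The names COLSPECS_3, FOUR_GROUP_RE and type_for are hypothetical, and this is a standalone function rather than the original method.

```python
import re

# Column types copied from the question; 'def' is the fallback.
COLSPECS_3 = {
    'id': 'SMALLINT(10)',
    'date': 'DATE',
    'comments': 'TEXT(4000)',
    'flag': 'BIT(1)',
    'def': 'VARCHAR(255)',
}

FOUR_GROUP_RE = re.compile('(ID$)|(Date)|(Comments$)|(FLAG$)', re.IGNORECASE)

def type_for(header_txt):
    m = FOUR_GROUP_RE.search(header_txt)
    # Only one alternative can match, so m.lastindex is the index of
    # the group that actually captured text.
    return COLSPECS_3[m.group(m.lastindex).lower() if m else 'def']

print(type_for('Provisioner ID'))
print(type_for('Customer Name'))
```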

The max(m.groups()) trick also isn’t necessary if you just make one capturing group instead of four: '(ID|Date|Comments|Flag)$'. Then you can do m.group(1).
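Put together, the single-group version might look like the sketch below. It is written as a plain function rather than a method, and COLSPECS, SINGLE_GROUP_RE and type_for_header are illustrative names. Note one behavioral difference from the original pattern: with the $ outside the group, Date is now anchored to the end of the header too, so a header such as 'Date Created' falls through to the default.

```python
import re

# Column types copied from the question; 'def' is the fallback.
COLSPECS = {
    'id': 'SMALLINT(10)',
    'date': 'DATE',
    'comments': 'TEXT(4000)',
    'flag': 'BIT(1)',
    'def': 'VARCHAR(255)',
}

# One capturing group, anchored at the end of the header text.
SINGLE_GROUP_RE = re.compile('(ID|Date|Comments|Flag)$', re.IGNORECASE)

def type_for_header(header_txt):
    m = SINGLE_GROUP_RE.search(header_txt)
    # group(1) is the keyword that matched; no max() over groups needed.
    key = m.group(1).lower() if m else 'def'
    return COLSPECS[key]

for header in ('Provisioner ID', 'Start Date', 'Date Created', 'Comments'):
    print(header, '->', type_for_header(header))
```

If you need the original behavior, where Date may appear anywhere in the header, moving the anchors back inside the group, as in '(ID$|Date|Comments$|Flag$)', should preserve it while still using a single group.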
