I am learning Python, and need to format “From” fields received from IMAP. I tried it using str.find() and str.strip(), and also using regex. With find(), etc. my function runs quite a bit faster than with re (I timed it). So, when is it better to use re? Does anybody have any good links/articles related to that? Python documentation obviously doesn’t mention that…
find only matches an exact sequence of characters, while a regular expression matches a pattern. Naturally, looking only for an exact sequence is faster (even if your regex pattern is also an exact sequence, there is still some overhead involved).

As a consequence of the above, you should use find if you know the exact sequence, and a regular expression (or something else) when you don’t. The exact approach you should use really depends on the complexity of the problem you face.

As a side note, the Python re module provides a compile function that allows you to pre-compile a regex if you are going to be using it repeatedly. This can improve speed if you are using the same pattern many times.
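To make the trade-off concrete, here is a minimal sketch of both approaches applied to a "From" value. The sample strings, the function names and the pattern are assumptions made up for illustration, not the asker's actual code, and real headers can be messier than this.

```python
import re

# Hypothetical "From" values, invented for this example.
SAMPLES = [
    '"Jane Doe" <jane@example.com>',
    'John Smith <john@example.com>',
    'bare@example.com',
]


def address_with_find(value):
    """Extract the address by searching for the exact characters < and >."""
    start = value.find("<")
    end = value.find(">", start + 1)
    if start == -1 or end == -1:
        # No angle brackets: assume the whole value is the address.
        return value.strip()
    return value[start + 1:end].strip()


# Compiled once at import time and reused on every call.
ADDRESS_RE = re.compile(r"<\s*([^<>]+?)\s*>")


def address_with_re(value):
    """Extract the address with a pre-compiled pattern."""
    match = ADDRESS_RE.search(value)
    return match.group(1) if match else value.strip()


for sample in SAMPLES:
    print(address_with_find(sample), address_with_re(sample))
```

Both functions return the same result for these samples, so the find version is the simpler and cheaper choice here. The regex version starts to pay off when the format varies in ways that would need many separate find and strip calls. To compare speed on your own data, the standard timeit module is one option.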