I am mapping some human-input text to part names today, and came across a case that might be of general interest. Here is the human-input text:
Seat (Dis.)
Disc seat
seats
Seat (Suc.)
Suct seat
The two part names to map to are SEAT, DISCHARGE and SEAT, SUCTION. We will also map the ambiguous "seats" to SEAT, DISCHARGE. The reason for using a regex is that we can anticipate new cases in the future, such as "discharge seat".
Currently I am tackling this with two search() calls, as in this pseudocode:
if [Ss][Ee][Aa][Tt] matches input name:
    if [Ss][Uu][Cc] matches input name, part is SEAT, SUCTION
    else part is SEAT, DISCHARGE
Is there a better way to do this kind of mapping? Better would mean: more compact code, easier to tweak to handle new cases, or a better chance of handling more cases without code modification.
Rather than look for all permutations of uppercase/lowercase letters, try just lowercasing the string and searching for the lowercase version.
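As a rough sketch of that suggestion, assuming Python and a hypothetical helper name map_part_name (neither is given in the question), lowercasing once lets plain substring checks replace the character classes:

```python
def map_part_name(raw_name):
    """Return the part name for a seat entry, or None if the text is not a seat."""
    name = raw_name.lower()          # normalise case once, up front
    if "seat" not in name:
        return None                  # not a seat at all
    if "suc" in name:                # covers "Suc.", "Suct", "suction"
        return "SEAT, SUCTION"
    return "SEAT, DISCHARGE"         # ambiguous entries such as "seats" land here
```

With the inputs from the question, "Seat (Suc.)" and "Suct seat" map to SEAT, SUCTION, and the other three map to SEAT, DISCHARGE.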
I’m not sure a regex is warranted in this case, unless you have more examples of strings you need to match. However, if you really want to do it that way, you can also make the regex call case insensitive by passing the re.I flag to any of the regex functions.
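If you do stay with regexes, a minimal sketch might look like the following. It assumes Python's re module; the names PART_RULES and map_part_name_re are made up for illustration. Keeping the patterns in an ordered list means a new case is usually a one-line change:

```python
import re

# Ordered rules: the first pattern that matches wins. Hypothetical table,
# extend it as new spellings show up.
PART_RULES = [
    (re.compile(r"suc", re.I), "SEAT, SUCTION"),
    (re.compile(r"", re.I), "SEAT, DISCHARGE"),  # empty pattern: default for any other seat
]

SEAT_RE = re.compile(r"seat", re.I)  # re.I makes the match case insensitive


def map_part_name_re(raw_name):
    """Return the part name for a seat entry, or None if the text is not a seat."""
    if not SEAT_RE.search(raw_name):
        return None
    for pattern, part in PART_RULES:
        if pattern.search(raw_name):
            return part
    return None
```

The empty pattern in the last rule always matches, so it acts as the "else" branch from the pseudocode in the question.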