In Rubular, I have created a regular expression:
(Prerequisite|Recommended): (\w|-| )*
It matches the bolded:
Recommended: good comfort level with computers and some of the arts.
Summer. 2 credits. Prerequisite:
pre-freshman standing or permission of
instructor. Credit may not be applied
toward engineering degree. S-U
grades only.
Here is a use of the regex in Python:
import re

note_re = re.compile(r'(Prerequisite|Recommended): (\w|-| )*', re.IGNORECASE)

def prereqs_of_note(note):
    match = note_re.match(note)
    if not match:
        return None
    return match.group(0)
Unfortunately, the code returns None instead of a match:
>>> import prereqs
>>> result = prereqs.prereqs_of_note("Summer. 2 credits. Prerequisite: pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only.")
>>> print result
None
What am I doing wrong here?
UPDATE: Do I need re.search() instead of re.match()?
You want to use re.search() because it scans the whole string for a match. You don’t want re.match() because it only tries to apply the pattern at the start of the string.
Also, if you want to match past the first period following the word “instructor”, you’re going to have to add a literal ‘.’ (escaped as \.) to your pattern.
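A minimal sketch of the difference, reusing the pattern and the sample note from the question. The names NOTE, anchored, found, and dotted_re are only illustrative, and dotted_re is one assumed way of adding the literal period, not necessarily the exact pattern meant above.

```python
import re

NOTE = ("Summer. 2 credits. Prerequisite: pre-freshman standing or permission of "
        "instructor. Credit may not be applied toward engineering degree. "
        "S-U grades only.")

# Pattern from the question, unchanged.
note_re = re.compile(r'(Prerequisite|Recommended): (\w|-| )*', re.IGNORECASE)

# match() only tries the pattern at position 0, where the text is "Summer".
anchored = note_re.match(NOTE)

# search() scans forward until the pattern matches somewhere in the string.
found = note_re.search(NOTE)

# Assumed variant: the same alternation with an escaped period added,
# so the match can continue past "instructor."
dotted_re = re.compile(r'(Prerequisite|Recommended): (\w|-| |\.)*', re.IGNORECASE)
dotted = dotted_re.search(NOTE)

print(anchored)          # None
print(found.group(0))    # stops just before the period after "instructor"
print(dotted.group(0))   # runs to the end of the note
```

In prereqs_of_note, the only change this implies is calling note_re.search(note) in place of note_re.match(note).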
I would suggest making your pattern greedier and matching the rest of the line with .*, unless that’s not really what you want, although it seems like it is.
For this example, the previous pattern with the literal ‘.’ added returns the same result as .* does.
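As a rough check of that equivalence, the sketch below runs both patterns over the sample note and then over a second, made-up note containing a comma. The names rest_of_line_re, dotted_re, and OTHER are hypothetical, and dotted_re is again an assumed spelling of the pattern with the literal period.

```python
import re

NOTE = ("Summer. 2 credits. Prerequisite: pre-freshman standing or permission of "
        "instructor. Credit may not be applied toward engineering degree. "
        "S-U grades only.")

# Rest-of-line version: .* takes everything up to the next newline.
rest_of_line_re = re.compile(r'(Prerequisite|Recommended): .*', re.IGNORECASE)

# Assumed explicit version: word characters, hyphens, spaces, and periods only.
dotted_re = re.compile(r'(Prerequisite|Recommended): (\w|-| |\.)*', re.IGNORECASE)

rest = rest_of_line_re.search(NOTE).group(0)
dotted = dotted_re.search(NOTE).group(0)
same = (rest == dotted)  # the sample note contains only the listed characters

# Hypothetical note with a comma, which the explicit pattern does not list.
OTHER = "Prerequisite: calculus, or permission of instructor."
rest_other = rest_of_line_re.search(OTHER).group(0)
dotted_other = dotted_re.search(OTHER).group(0)

print(same)
print(rest_other)
print(dotted_other)
```

The two agree on the sample note only because it contains nothing but word characters, hyphens, spaces, and periods. On the second note the explicit pattern stops at the comma while .* keeps going. Note also that . does not match a newline unless re.DOTALL is passed, so .* stops at the end of the current line in multi-line text.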