So I’m trying to get better at Python in general, but I’m having some trouble using the re module for regular expressions.
I have a comma-separated file that I’m reading in, and I want to find all occurrences of a line ending in a comma followed by 5 (,5). So I used the code below:
five_rating = re.compile(r",5$", re.MULTILINE)
print five_rating.findall(file.read())
but I don’t get any output. There are definitely lines that match the regular expression I’m using. I’ve tested my regex on Python regex websites and it matches what I want, but in code it just doesn’t work!
Is there something obvious I’m doing wrong here?
Oh and I’m using Ubuntu and the file should have DOS style line endings, but I tried converting the end-line characters using the code from this post and it didn’t do the trick.
btw here’s a sample of the input:
9605,Ace Ventura: Pet Detective,5
9606,Ace Ventura: Pet Detective,1
9607,Ace Ventura: Pet Detective,4
9608,Ace Ventura: Pet Detective,3
9609,Ace Ventura: Pet Detective,2
9610,Ace Ventura: Pet Detective,4
9611,Ace Ventura: Pet Detective,3
9612,Ace Ventura: Pet Detective,4
9613,Ace Ventura: Pet Detective,5
9614,Ace Ventura: Pet Detective,5
9615,Ace Ventura: Pet Detective,4
9616,Ace Ventura: Pet Detective,1
9617,Ace Ventura: Pet Detective,3
9618,Ace Ventura: Pet Detective,4
9619,Ace Ventura: Pet Detective,3
9620,Ace Ventura: Pet Detective,1
9621,Ace Ventura: Pet Detective,2
9622,Ace Ventura: Pet Detective,3
9623,Ace Ventura: Pet Detective,5
9624,Ace Ventura: Pet Detective,2
9625,Ace Ventura: Pet Detective,2
9626,Ace Ventura: Pet Detective,4
9627,Ace Ventura: Pet Detective,3
9628,Ace Ventura: Pet Detective,1
Given your input (which could be a file) as a multiline string, like this:
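A minimal sketch of such a string, built from a few rows of the sample input in the question. The name txt is an assumption made for illustration, and the snippet is written for Python 3 (the question's code is Python 2).

```python
# A few rows from the question's sample, held in one multiline string.
# `txt` is a hypothetical name; with a real file this would be the
# result of reading the whole file into memory.
txt = """\
9605,Ace Ventura: Pet Detective,5
9606,Ace Ventura: Pet Detective,1
9607,Ace Ventura: Pet Detective,4
9613,Ace Ventura: Pet Detective,5
"""

# Each record sits on its own line, terminated by '\n'.
rows = txt.splitlines()
print(len(rows))
```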
This works:
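One way this could look, as a sketch rather than the exact code originally posted: a compiled pattern with a leading .* so that each match covers the whole line, iterated with re.finditer. The names txt, five_rating and matches are assumptions, and the print call is Python 3 syntax.

```python
import re

# Hypothetical sample data taken from the question.
txt = """\
9605,Ace Ventura: Pet Detective,5
9606,Ace Ventura: Pet Detective,1
9607,Ace Ventura: Pet Detective,4
9613,Ace Ventura: Pet Detective,5
"""

# With re.MULTILINE, '^' and '$' anchor at the start and end of every line.
# '.*' consumes the beginning of the line, so the match is the full line.
five_rating = re.compile(r"^.*,5$", re.MULTILINE)

matches = [m.group(0) for m in five_rating.finditer(txt)]
for line in matches:
    print(line)
```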
or a re.findall version, which (somewhat confusingly IMHO) will also work without parens:
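A short sketch of both re.findall forms, assuming the same hypothetical txt string. When the pattern has one capturing group, re.findall returns the text of that group. When it has no group, it returns the whole match. Here the group spans the entire match, so the two results are the same. For comparison, the bare pattern from the question is included as well.

```python
import re

# Hypothetical sample data taken from the question.
txt = """\
9605,Ace Ventura: Pet Detective,5
9606,Ace Ventura: Pet Detective,1
9607,Ace Ventura: Pet Detective,4
9613,Ace Ventura: Pet Detective,5
"""

# One capturing group: findall returns the group's text for each match.
with_parens = re.findall(r"^(.*,5)$", txt, re.MULTILINE)

# No group: findall returns the whole match for each hit.
without_parens = re.findall(r"^.*,5$", txt, re.MULTILINE)

# The question's pattern has no '.*', so only the ',5' fragment is matched.
bare = re.findall(r",5$", txt, re.MULTILINE)

print(with_parens == without_parens)
print(bare)
```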
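Yours is not returning the lines because it has no .* in front, meaning ‘match everything up to the ,5$’. Without it, re.findall returns only the matched ‘,5’ text, not the whole line.
Also, as stated in one of the comments, using file as an identifier is a bad idea, since it shadows the Python 2 builtin of the same name.
You can also use Python’s string processing to do this: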
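A minimal sketch of the string-processing route, assuming the same hypothetical txt string. It uses str.splitlines and str.endswith, which is one reasonable reading of "string processing" and not necessarily the code originally posted.

```python
# Hypothetical sample data taken from the question.
txt = """\
9605,Ace Ventura: Pet Detective,5
9606,Ace Ventura: Pet Detective,1
9607,Ace Ventura: Pet Detective,4
9613,Ace Ventura: Pet Detective,5
"""

# splitlines() drops the line terminators ('\n' as well as '\r\n'),
# so endswith(",5") tests the true end of each record.
fives = [line for line in txt.splitlines() if line.endswith(",5")]

for line in fives:
    print(line)
```

Because splitlines() also strips DOS-style terminators, this variant is not affected by a stray carriage return at the end of a line.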
And if you really have a CSV file to process, use the built-in csv module.
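A sketch of the csv approach, assuming the rating is always the last column. The sample is wrapped in io.StringIO so the snippet runs on its own. With a real file, the open file object would be passed to csv.reader instead. The names txt and five_rows are hypothetical.

```python
import csv
import io

# Hypothetical sample data taken from the question.
txt = """\
9605,Ace Ventura: Pet Detective,5
9606,Ace Ventura: Pet Detective,1
9607,Ace Ventura: Pet Detective,4
9613,Ace Ventura: Pet Detective,5
"""

# csv.reader yields each record as a list of column strings.
reader = csv.reader(io.StringIO(txt))

# Keep non-empty rows whose last column is the string "5".
five_rows = [row for row in reader if row and row[-1] == "5"]

for row in five_rows:
    print(row)
```

Comparing the parsed column avoids regex anchoring issues entirely, and it keeps working if a title ever contains a quoted comma.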
Finally, if you have a DOS file on *nix, just use Python’s universal newline support by calling open with ‘U’ in the mode:
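The 'U' mode flag belongs to Python 2. In Python 3, text mode translates line endings by default (newline=None), and the 'U' flag was removed in Python 3.11. The sketch below assumes Python 3 and uses a throwaway temporary file in place of the real CSV. It shows that a leftover carriage return stops $ from matching right after ,5, while translated text matches.

```python
import os
import re
import tempfile

# Write a small file with DOS (CRLF) line endings to a temporary path.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"9605,Ace Ventura: Pet Detective,5\r\n"
              b"9606,Ace Ventura: Pet Detective,1\r\n")
    path = tmp.name

try:
    # newline=None (the default) converts '\r\n' to '\n' while reading.
    with open(path, newline=None) as f:
        translated = f.read()
    # newline='' disables translation, so the '\r' characters survive.
    with open(path, newline="") as f:
        raw = f.read()
finally:
    os.remove(path)

pattern = re.compile(r"^.*,5$", re.MULTILINE)

# In the raw text a '\r' sits between ',5' and the '\n', so '$' cannot match.
translated_hits = pattern.findall(translated)
raw_hits = pattern.findall(raw)

print(translated_hits)
print(raw_hits)
```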