I am a Python beginner and want Python to capture all text in quotation marks from a text file. I have tried the following:
filename = raw_input("Enter the full path of the file to be used: ")
input = open(filename, 'r')
import re
quotes = re.findall(ur'"[\^u201d]*["\u201d]', input)
print quotes
I get the error:
Traceback (most recent call last):
File "/Users/nithin/Documents/Python/Capture Quotes", line 5, in <module>
quotes = re.findall(ur'"[\^u201d]*["\u201d]', input)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
Can anyone help me out?
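As Bakuriu has pointed out, you need to call .read() on the file object, i.e. pass input.read() to re.findall instead of input. open() merely returns a file object, whereas f.read() returns a string, and re.findall expects a string.
In addition, I'm guessing you want everything between two quotation marks rather than zero or more occurrences of [\^u201d] (which matches only the literal characters ^, u, 2, 0, 1 and d) before a quotation mark. So I would try a pattern that captures everything from the opening quotation mark up to the closing one, and pass the re.U flag to account for Unicode. Or, if you don't have two sets of right double quotation marks and don't need Unicode, a simpler pattern without the \u201d handling and without re.U is enough.
Finally, you may want to choose a different variable name than input, since input is the name of a built-in function in Python and assigning to it shadows that function.
Your result will be a list of the matched strings.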
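A minimal sketch of the fixes described above, assuming Python 3 (the question uses Python 2, where print is a statement). The names QUOTE_PATTERN, find_quotes and quotes_in_file, the UTF-8 default, and the sample string are illustrative assumptions, not the original answer's code. It treats both straight quotes and curly quotes (\u201c ... \u201d) as delimiters.

```python
import io
import re

# Assumed pattern: text between straight quotes ("...") or between
# curly quotes (\u201c...\u201d). re.U enables Unicode-aware matching.
QUOTE_PATTERN = re.compile(u'"([^"]*)"|\u201c([^\u201d]*)\u201d', re.U)


def find_quotes(text):
    """Return the text found inside each pair of quotation marks."""
    # findall yields one (straight, curly) tuple per match; only one
    # of the two groups is non-empty, so keep whichever one matched.
    return [straight if straight else curly
            for straight, curly in QUOTE_PATTERN.findall(text)]


def quotes_in_file(path, encoding="utf-8"):
    """Read the whole file as a string, then search that string."""
    # Passing the file object itself to re.findall raises TypeError;
    # .read() returns the string that re.findall expects.
    with io.open(path, "r", encoding=encoding) as text_file:
        return find_quotes(text_file.read())


sample = u'He said "hello" and she replied \u201cgoodbye\u201d.'
print(find_quotes(sample))
```

The file handle is named text_file rather than input so that the built-in input function is not shadowed. To use it on a real file, call quotes_in_file with the path you would otherwise have typed at the prompt.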