I am trying to read from a file and match a certain combination of strings. Please find my program below:
def negative_verbs_features(filename):
    # Open and read the file content
    file = open(filename, "r")
    text = file.read()
    file.close()
    # Create a list of negative verbs from the MPQA lexicon
    file_negative_mpqa = open("../data/PolarLexicons/negative_mpqa.txt", "r")
    negative_verbs = []
    for line in file_negative_mpqa:
        #print line,
        pos, word = line.split(",")
        #print line.split(",")
        if pos == "verb":
            negative_verbs.append(word)
    return negative_verbs

if __name__ == "__main__":
    print negative_verbs_features("../data/test.txt")
The file negative_mpqa.txt consists of word, part-of-speech tag pairs separated by a comma (,). Here's a snippet of the file:
abandoned,adj
abandonment,noun
abandon,verb
abasement,anypos
abase,verb
abash,verb
abate,verb
abdicate,verb
aberration,adj
aberration,noun
I would like to create a list of all words in the file that have verb as their part-of-speech. However, when I run my program, the list returned (negative_verbs) is always empty. The body of the if statement never executes. I tried printing each word, pos pair by uncommenting the line print line.split(","). Below is a snippet of the output.
['wrongful', 'adj\r\n']
['wrongly', 'anypos\r\n']
['wrought', 'adj\r\n']
['wrought', 'noun\r\n']
['yawn', 'noun\r\n']
['yawn', 'verb\r\n']
['yelp', 'verb\r\n']
['zealot', 'noun\r\n']
['zealous', 'adj\r\n']
['zealously', 'anypos\r\n']
I understand my file may have some special characters, such as a carriage return and a newline, at the end of every line. I just want to ignore them and build my list. Kindly let me know how to proceed.
PS: I am a newbie in Python.
You said the file has lines like this:
abandoned,adj
so those are word, pos pairs. But you wrote pos, word = line.split(",") which means that pos == 'abandoned' and word == 'adj'… I think it's clear why the list will be empty now 🙂
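A minimal sketch of the fix, assuming every non-blank line holds exactly one comma: unpack in word, pos order, and strip the line first, because the printed output in the question shows the tag still carrying a trailing '\r\n' (so 'verb\r\n' would not compare equal to 'verb' either). The function name negative_verbs_from_lines and the sample list are made up for illustration and are not part of the original program.

```python
def negative_verbs_from_lines(lines):
    """Collect the words tagged as verbs from 'word,pos' lines.

    Hypothetical helper: `lines` can be any iterable of strings,
    for example an open file object or a plain list.
    """
    negative_verbs = []
    for line in lines:
        # strip() drops the trailing '\r\n' and any surrounding whitespace
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The file stores the word first and the part-of-speech tag second.
        # Assumption: exactly one comma per line, otherwise this raises ValueError.
        word, pos = line.split(",")
        if pos == "verb":
            negative_verbs.append(word)
    return negative_verbs


# Sample data shaped like the snippet in the question, with Windows line endings
sample = ["abandoned,adj\r\n", "abandon,verb\r\n", "abase,verb\r\n", "aberration,noun\r\n"]
print(negative_verbs_from_lines(sample))  # prints ['abandon', 'abase']
```

With the real lexicon you would pass the open file object in place of the sample list, ideally inside a with open(...) block so the file is closed automatically.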