I have a small problem. I have this piece of code in Python (taken from a larger script):
for line in open(trainFile):
    for token, tag in [x.rsplit('/', 1) for x in line.split()]:
        tokenTagCount[(token, tag)] += 1
        tags[tag] += 1
        listOfTags.append(tag)
The trainFile contains words and tags for Danish, but that’s not the issue. The problem is this: because the file is in Danish, I have to include # -*- coding: cp1252 -*- as its first line so that the characters display properly in Python. However, my for loop (“for line in open…”) should ignore this first line about the coding and start at the second line of the trainFile, where the actual data begin. How do I do this?
Thanks!
This is how you can skip the first line:
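A minimal sketch of one common way to do it: call next() on the file object once before the loop, so the loop starts at the second line. The function name count_tags, its path and encoding parameters, and the use of defaultdict for the counters are assumptions made for illustration; they are not from the question. The encoding argument assumes Python 3 and a cp1252-encoded file.

```python
from collections import defaultdict

def count_tags(path, encoding="cp1252"):
    """Count (token, tag) pairs and tags, ignoring the first line of the file."""
    # Assumption: the counters are defaultdict(int), as the += 1 in the question suggests.
    token_tag_count = defaultdict(int)
    tags = defaultdict(int)
    list_of_tags = []
    with open(path, encoding=encoding) as f:
        # Consume the first line; the None default avoids StopIteration on an empty file.
        next(f, None)
        for line in f:
            for token, tag in (x.rsplit("/", 1) for x in line.split()):
                token_tag_count[(token, tag)] += 1
                tags[tag] += 1
                list_of_tags.append(tag)
    return token_tag_count, tags, list_of_tags
```

This works because a file object is its own iterator, so the for loop simply continues from wherever next() left off.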
A better option might be to skip lines that start with #:
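A rough sketch of that idea, assuming Python 3 and a cp1252-encoded file; the generator name data_lines and its parameters are hypothetical:

```python
def data_lines(path, encoding="cp1252"):
    """Yield every line of the file except those that begin with '#'."""
    with open(path, encoding=encoding) as f:
        for line in f:
            if line.startswith("#"):
                continue  # treat the line as a comment, e.g. the coding declaration
            yield line
```

The loop from the question would then read "for line in data_lines(trainFile):". Note that this also drops any data line whose first token begins with #, so it only fits if that never happens in the training data.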