I was trying to unify the lines in my file when I observed the following:
word1 word2
word1 word2
I did not understand why these lines were not combined, so I opened the file in vim and used :set list to check for special characters. I found this:
word1 <feff>word2
word1 word2
I am not sure how to clean this up in Python. Any suggestions on what this character might be and how it can be removed?
U+FEFF is the Byte Order Mark character, which should only occur at the start of a document. Anywhere else in a document, it should be
treated as a ZERO WIDTH NON-BREAKING SPACE. If it causes issues, you can remove it like any other character, for example by replacing
u'\ufeff' with an empty string. (In Python 3.1 or 3.2, drop the u in front of string literals.)
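As a minimal sketch of that removal, assuming Python 3.3 or later (where the u prefix is optional, so plain literals are used) and assuming the lines are already decoded into str objects. The helper name strip_bom and the sample lines are made up for illustration:

```python
BOM = "\ufeff"  # U+FEFF: byte order mark / ZERO WIDTH NO-BREAK SPACE


def strip_bom(text):
    """Return text with every U+FEFF character removed (hypothetical helper)."""
    return text.replace(BOM, "")


# Sample data mirroring the two lines from the question.
lines = ["word1 \ufeffword2", "word1 word2"]

# Once the invisible character is gone, the two lines compare equal.
cleaned = [strip_bom(line) for line in lines]
unique = sorted(set(cleaned))
print(unique)
```

After cleaning, both lines are the same string, so deduplicating them should leave a single entry.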