I keep getting this error while reading a text file. Is it possible to handle/ignore it and proceed?
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 7827: character maps to <undefined>
In Python 3, pass an appropriate errors= value (such as errors='ignore' or errors='replace') when creating your file object (presuming it to be a subclass of io.TextIOWrapper; if it isn't, consider wrapping it in one). Also consider passing a more likely encoding than charmap (when you aren't sure, utf-8 is always a good place to start).
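As a minimal sketch of the Python 3 advice above: the sample bytes and the temporary file are made up for illustration, while open() and io.TextIOWrapper are the standard-library calls being demonstrated.

```python
import io
import os
import tempfile

# Hypothetical sample data: byte 0x81 is not valid UTF-8 on its own.
raw = b"hello \x81 world\n"

# Write the raw bytes to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(raw)
    path = tmp.name

# errors="replace" substitutes U+FFFD for each undecodable byte.
with open(path, encoding="utf-8", errors="replace") as f:
    replaced = f.read()

# errors="ignore" silently drops undecodable bytes.
with open(path, encoding="utf-8", errors="ignore") as f:
    ignored = f.read()

os.remove(path)

# A binary stream that is not already a text wrapper can be wrapped in one.
wrapped = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", errors="replace").read()
```

Note that both handlers lose information: "replace" at least leaves a visible marker where a byte could not be decoded, while "ignore" drops it without a trace.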
In Python 2, the read() operation simply returns bytes; the trick, then, is decoding them to get a string (if you do, in fact, want characters as opposed to bytes). If you don't have a better guess for their real encoding, decode them as utf-8 and pass 'replace' as the error handler to replace unhandled characters, or 'ignore' to simply ignore them.
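The two decode options described above might look like the sketch below. The variable names and sample bytes are hypothetical; the byte-string literal is written so the same lines run on Python 2 (where read() returns such a string) and on Python 3 (where it is a bytes object).

```python
# Hypothetical bytes as they might come back from read(); 0x81 is invalid UTF-8 here.
your_bytes = b"caf\x81 ok"

# 'replace' swaps each undecodable byte for the U+FFFD replacement character.
replaced = your_bytes.decode("utf-8", "replace")

# 'ignore' drops each undecodable byte.
ignored = your_bytes.decode("utf-8", "ignore")
```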
That said, finding and using their real encoding (rather than guessing utf-8) would be preferred.
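If the real encoding is unknown but you have a short list of plausible ones, one rough approach is to try each with strict decoding and keep the first that succeeds. This is only a sketch: the helper name decode_with_candidates and the candidate list are assumptions, not part of the original answer, and a successful decode does not prove the guess is right.

```python
def decode_with_candidates(data, candidates=("utf-8", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes strictly.

    The candidate list is a hypothetical default; substitute encodings
    that are plausible for your own data.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly: fall back to a lossy utf-8 decode.
    return data.decode("utf-8", "replace"), None


text, used = decode_with_candidates(b"caf\xe9")
```

Order matters here: permissive single-byte encodings accept almost any input, so stricter encodings such as utf-8 should come first.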