In a text file I’m processing, I have characters like ��. Not sure what

Question

Editorial Team

Asked: June 6, 2026 (2026-06-06T18:58:39+00:00)

In a text file I’m processing, I have characters like ��. Not sure what they are.

I’m wondering how to remove/convert these characters.

I have tried to convert it to ASCII using .encode('ascii', 'ignore'). Python told me the character is not in range(128).

I have also tried unicodedata: unicodedata.normalize('NFKD', text).encode('ascii', 'ignore'), with the same error.

Anyone help?

Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-06-06T18:58:40+00:00

You can always take a Unicode string and use the code you showed:

my_ascii = my_uni_string.encode('ascii', 'ignore')

If that gave you an error, then you didn’t really have a Unicode string to begin with. If that is true, then you have a byte string instead. You’ll need to know what encoding it’s using, and you can turn it into a Unicode string with:

my_uni_string = my_byte_string.decode('utf8')

(assuming your encoding is UTF-8).
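Putting the two steps together, here is a minimal sketch in Python 3 syntax (where a byte string is `bytes` and a Unicode string is `str`). The helper name `bytes_to_ascii` and the sample inputs are made up for illustration, and UTF-8 is only an assumption about your file; substitute the real encoding if you know it.

```python
import unicodedata


def bytes_to_ascii(raw, encoding="utf-8"):
    """Decode raw bytes to text, then drop everything that is not ASCII.

    `encoding` is an assumption about the input; UTF-8 is just the default.
    """
    # Step 1: bytes -> Unicode text. errors="replace" turns undecodable
    # bytes into U+FFFD instead of raising UnicodeDecodeError.
    text = raw.decode(encoding, errors="replace")
    # Step 2: NFKD splits accented letters into base letter + combining mark,
    # so the base letter survives the ASCII step.
    text = unicodedata.normalize("NFKD", text)
    # Step 3: text -> ASCII bytes, silently dropping non-ASCII characters,
    # then back to a str for convenience.
    return text.encode("ascii", "ignore").decode("ascii")


# Hypothetical sample: accented letters plus a snowman, stored as UTF-8 bytes.
sample = "caf\u00e9 \u2603 na\u00efve".encode("utf-8")
print(bytes_to_ascii(sample))
```

The accented letters keep their base letter and the snowman is dropped entirely. If the decode step produces lots of replacement characters, the assumed encoding is probably wrong.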

This split between byte string and Unicode string can be confusing. My presentation, Pragmatic Unicode, or, How Do I Stop The Pain can help you to keep it all straight.
