I am getting a UnicodeDecodeError: ‘utf8’ codec can’t decode bytes… invalid start byte.

Editorial Team

Asked: June 15, 2026 (2026-06-15T16:06:42+00:00)

I am getting a UnicodeDecodeError: ‘utf8’ codec can’t decode bytes… invalid start byte.

I suspect it has to do with one of the values in my dictionary. To access all fields and put them into a dict, I use:

        mydictionary = {x:y for x,y in zip(column, values)}

What could I change to guarantee that the values are converted into something UTF-8 compliant, or otherwise avoid this error?

column contains all column headers… values contains a tuple with all values that correspond to the column

For example:
column = ('NAME', 'HOBBY')
values = ('George', 'Basketball')

The issue I am having is that somewhere in values there is something like this:
values = ('-insert strange utf8 noncompliant character-George', 'Basketball')

1 Answer

Editorial Team · Answer 1 · 2026-06-15T16:06:43+00:00

If you don’t care about the exact content of the bad values, you can tell the UTF-8 codec to ignore errors:

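import codecs

# Assumes column and values hold byte strings.
# codecs.decode returns the decoded text directly; 'ignore' drops undecodable bytes.
# (The decode attribute from codecs.lookup returns a (text, length) tuple instead.)
mydictionary = {codecs.decode(x, 'utf-8', 'ignore'): codecs.decode(y, 'utf-8', 'ignore') for x, y in zip(column, values)}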

Alternatively, replacing 'ignore' with 'replace' will cause the codec to substitute the Unicode “replacement character” code point (U+FFFD) for any malformed byte sequences. If you are only concerned about malformed strings in values, you can omit the decode call on the keys.
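
To make the difference between the two error handlers concrete, here is a minimal sketch. It assumes Python 3 and that the problem values arrive as bytes. The sample data and the helper name to_text are made up for illustration and are not from the question. Only the values are decoded, and the keys are left untouched.

```python
column = ('NAME', 'HOBBY')
# Hypothetical sample data: the byte 0xff can never start a valid UTF-8 sequence.
values = (b'\xffGeorge', b'Basketball')


def to_text(value, errors='replace'):
    """Decode bytes as UTF-8; return anything that is not bytes unchanged."""
    if isinstance(value, bytes):
        return value.decode('utf-8', errors)
    return value


# 'ignore' silently drops the undecodable byte.
ignored = {key: to_text(value, 'ignore') for key, value in zip(column, values)}
# 'replace' substitutes U+FFFD so the damage stays visible.
replaced = {key: to_text(value, 'replace') for key, value in zip(column, values)}

print(ignored)   # NAME maps to 'George'
print(replaced)  # NAME maps to '\ufffdGeorge'
```

Using 'replace' is often the safer of the two when you may later need to find out which records were damaged, since 'ignore' leaves no trace of the dropped bytes.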
