We have a MySQL table with fields using latin1_swedish_ci however there is some UTF-8

Question


Asked: June 8, 2026 (2026-06-08T21:13:58+00:00)


We have a MySQL table with fields using latin1_swedish_ci; however, there is some UTF-8 encoded data in there.

When I change the collation from latin1_swedish_ci to utf8_general_ci, it converts the ISO-8859-1 data fine, but the UTF-8 data gets broken: we see things like youâ€™re.

Any ideas?

Edit: we are using InnoDB with MySQL 5.1


1 Answer

Editorial Team · Answer 1 · 2026-06-08T21:13:59+00:00

So, your “latin1” column had some data encoded in UTF-8, but not everything, and now you have some data doubly-encoded in UTF-8?

You could do this to fix the doubly encoded values: first convert back to latin1, reinterpret the string as binary, and then tell MySQL that the binary string is actually text encoded in UTF-8. In MySQL this can be written as:

convert(binary convert(mycolumn using latin1) using utf8)
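
To see what that expression does at the byte level, here is a minimal Python sketch of the same round trip. It is a model, not MySQL itself: it assumes Python's cp1252 codec is a close enough stand-in for MySQL's latin1, and the function names double_encode and repair are made up for illustration.

```python
# Model of: convert(binary convert(mycolumn using latin1) using utf8)
# Assumption: Python's "cp1252" codec approximates MySQL's latin1 charset.

def double_encode(text: str) -> str:
    """Simulate the damage: UTF-8 bytes misread as latin1 (cp1252) text."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo it: turn the text back into latin1 bytes, then read those bytes as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


broken = double_encode("you’re")
print(broken)          # youâ€™re
print(repair(broken))  # you’re
```

The right single quote is three bytes in UTF-8 (E2 80 99), and each of those bytes read as latin1 becomes one visible character, which is why one character turns into three.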

Now the problem is detecting which strings are doubly encoded, so that you can update only those. This can be done by comparing the length of the original string in characters with the length of the new string in bytes; for doubly-encoded text they should be equal. This is how you end up with:

update mytable set mycolumn = @str where char_length(mycolumn) = 
    length(@str := convert(binary convert(mycolumn using latin1) using utf8));
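
The length comparison in the WHERE clause can be modeled in Python as a rough sketch. The assumptions are loud ones: cp1252 stands in for MySQL's latin1, and invalid UTF-8 bytes are simply dropped (errors="ignore") to imitate a conversion that does not yield a full-length string; how MySQL 5.1 actually treats invalid input is not modeled here. The name passes_length_check is hypothetical.

```python
def passes_length_check(text: str) -> bool:
    """Model of: char_length(col) = length(convert(binary convert(col using latin1) using utf8))."""
    try:
        raw = text.encode("cp1252")  # stand-in for convert(col using latin1)
    except UnicodeEncodeError:
        return False  # a character outside cp1252 cannot come from this kind of double encoding
    # Assumption: invalid UTF-8 bytes are dropped, so a bad candidate comes out short.
    candidate = raw.decode("utf-8", errors="ignore")
    # Characters in the original versus bytes in the candidate.
    return len(text) == len(candidate.encode("utf-8"))


for value in ["youâ€™re", "café", "plain ascii"]:
    print(repr(value), passes_length_check(value))
# 'youâ€™re' True
# 'café' False
# 'plain ascii' True
```

Pure ASCII rows pass the check too, but the conversion leaves them unchanged, so updating them is harmless. Genuine latin1 text that happens to form a valid UTF-8 byte sequence (for example a literal "Ã©") would also pass, so it is worth previewing the affected rows before running the update.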
