The Archive Base


Editorial Team
Asked: May 26, 2026 (2026-05-26T08:31:30+00:00)


I have a MySQL database of strings that I left dormant for a while. Now that I have picked it up again, I noticed that all the special characters are garbled. My ISP moved the server to a different machine, and I suspect that this is when it happened.

The database was populated by a PHP script. Everything was supposed to be in UTF-8; that’s what the database is set to.

However, this is what a string looks like now:

fête

Those four special characters are supposed to be a single character, ê; the string is meant to be fête.

Now it looks like this is just re-encoded twice, but that doesn’t seem right. Those four characters in hex are:

C3 83 C6 92 C3 82 C2 AA

This looks very much like UTF-8, so if we decode it, we get

C3 3F C2 AA

This isn’t quite UTF-8 (because of the 3F), but let’s decode it again:

FF AA

This is not UTF-8.

The ê character is U+00EA (byte EA in ISO-8859-1); in UTF-8, that would be C3 AA.

Another example: The Spanish upside-down question mark (¿) is there as C8 83 E2 80 9A C3 82 C2, which decodes to C3 3F 82 BF, which isn’t proper UTF-8 again (translates to FF 82 BF). The expected character for ¿ is BF, i.e. C2 BF in proper UTF-8.

What happened here? How did the characters get messed up? More importantly, how do I fix it?

(Side note – the new server requires me to write mysql_set_charset("utf8"); or else strings get messed up too, although in the “UTF-8 as latin1” fashion, not in this weird fashion as seen above.)

TL;DR:

  • MySQL database was populated in UTF-8 through PHP script
  • Lay dormant for years, server got migrated.
  • Now characters are messed up, see above.

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 8:31 am
    C3 83 C6 92 C3 82 C2 AA
    

    This looks very much like UTF-8, so if we decode it, we get

    C3 3F C2 AA
    

    That’s what you get if you treat the sequence of bytes as UTF-8, then encode it as ISO-8859-1. 3F is ?, which has been included as a replacement character, because UTF-8 C6 92 is U+0192 ƒ which does not exist in ISO-8859-1. But it does exist in Windows code page 1252 Western European, an encoding very similar to ISO-8859-1; there, it’s byte 0x83.

    C3 83 C2 AA
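To make that first step concrete, here is a minimal Python 3 sketch (3.8+ for `bytes.hex` with a separator). It uses only the byte values quoted from the question; the variable names are just for illustration. It shows the same bytes giving a `3F` under strict ISO-8859-1 but surviving intact under cp1252:

```python
# Bytes reported in the question for the four garbled characters.
stored = bytes.fromhex("C3 83 C6 92 C3 82 C2 AA")

# Read the bytes as UTF-8 to recover the characters they spell.
text = stored.decode("utf-8")

# Strict ISO-8859-1 has no U+0192, so it degrades to '?' (byte 0x3F).
as_latin1 = text.encode("iso-8859-1", errors="replace")

# cp1252 maps U+0192 to byte 0x83, so nothing is lost.
as_cp1252 = text.encode("cp1252")

print(as_latin1.hex(" ").upper())  # C3 3F C2 AA
print(as_cp1252.hex(" ").upper())  # C3 83 C2 AA
```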
    

    Go through another round of treat-as-UTF-8-bytes-and-encode-to-cp1252 and you get:

    C3 AA
    

    which is, finally, UTF-8 for ê.
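As a sanity check on this explanation, the damage can be reproduced and undone in a few lines of Python 3. This is only a sketch: `redo_round` and `undo_round` are hypothetical helper names, and it assumes every round of damage was "UTF-8 bytes read as cp1252, then stored as UTF-8 again":

```python
def redo_round(data: bytes) -> bytes:
    """Apply the mis-encoding once: read bytes as cp1252, store as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


def undo_round(data: bytes) -> bytes:
    """Reverse one round: read bytes as UTF-8, write back as cp1252."""
    return data.decode("utf-8").encode("cp1252")


good = "f\u00eate".encode("utf-8")          # "fête" as correct UTF-8
damaged = redo_round(redo_round(good))      # two rounds of damage
repaired = undo_round(undo_round(damaged))  # two rounds of repair

print(damaged.hex(" ").upper())
print(repaired.decode("utf-8"))
```

Two rounds of `redo_round` turn the C3 AA for ê into exactly the C3 83 C6 92 C3 82 C2 AA seen in the question, which supports the double-encoding theory.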

    Note that even if you serve a non-XML HTML page explicitly as ISO-8859-1, browsers will actually use the cp1252 encoding, due to nasty historical reasons.

    MySQL behaves similarly: according to the MySQL documentation, its latin1 character set is really cp1252 rather than strict ISO-8859-1 (bytes 0x80 to 0x9F are assigned characters). So dumping as latin1 and then reloading as utf8 (twice) may be enough to repair the data; try it on a copy first. Alternatively, process the dump with a text editor that can save as either encoding, or in Python 3 with open(path, 'rb').read().decode('utf-8').encode('cp1252').decode('utf-8').encode('cp1252').
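The Python chain above can be wrapped in a small helper for a dump file. This is a sketch under assumptions: `repair_dump` and the file names are hypothetical, the whole file is assumed to have been damaged the same number of times, and it is assumed to fit in memory.

```python
from pathlib import Path


def repair_dump(src: str, dst: str, rounds: int = 2) -> None:
    """Undo `rounds` layers of 'UTF-8 read as cp1252' damage in a file."""
    data = Path(src).read_bytes()
    for _ in range(rounds):
        # Each pass reads the bytes as UTF-8 and writes them back as
        # cp1252, which strips one layer of mis-encoding.
        data = data.decode("utf-8").encode("cp1252")
    Path(dst).write_bytes(data)


# Example call (hypothetical file names):
# repair_dump("dump_broken.sql", "dump_fixed.sql")
```

If some rows were never damaged, or the text contains characters that cp1252 cannot represent, the encode step raises UnicodeEncodeError rather than silently corrupting anything. Run it on a copy and check a few known strings afterwards.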

© 2021 The Archive Base. All Rights Reserved