Migrating Data from MySQL server1 to MySQL server2
server1 Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (x86_64) using readline 5.2
mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+------------------------------------------+
| Variable_name            | Value                                    |
+--------------------------+------------------------------------------+
| character_set_client     | utf8                                     |
| character_set_connection | utf8                                     |
| character_set_database   | utf8                                     |
| character_set_filesystem | binary                                   |
| character_set_results    | utf8                                     |
| character_set_server     | latin1                                   |
| character_set_system     | utf8                                     |
| character_sets_dir       | /data/mysql/gabino/share/mysql/charsets/ |
+--------------------------+------------------------------------------+
8 rows in set
server2 Ver 14.12 Distrib 5.0.90, for pc-linux-gnu (x86_64) using readline 6.0
mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set
MySQL on server1 is the backend of a WordPress blog. Everything works fine from the frontend, but now I (the unlucky guy) have to migrate the data, so I logged into phpMyAdmin and the MySQL console. From the backend, every East Asian character on server1 appears garbled, both in SELECT queries in the console and in mysqldump files. For example, the Chinese character 看 turns into the three latin1 characters çœ‹, which is the same result as SELECT _latin1'看'. The UTF-8 representation of 看 is \xe7\x9c\x8b, so MySQL is somehow displaying each byte as an individual latin1 character instead of rendering the three bytes as one Chinese character.
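The symptom can be reproduced outside MySQL. What follows is only a small Python sketch, not part of the original setup; it assumes that Python's cp1252 codec is a fair stand-in for MySQL's latin1 (which is the Windows-1252 variant), and the variable names are made up for illustration.

```python
# Illustrative sketch: render each UTF-8 byte of one Chinese character
# as a separate latin1 (cp1252) character, as a latin1 client would.
utf8_bytes = "看".encode("utf-8")
print(utf8_bytes.hex())        # e79c8b

# Decode every byte on its own instead of as one 3-byte sequence.
per_byte = [bytes([b]).decode("cp1252") for b in utf8_bytes]
garbled = "".join(per_byte)    # the three characters ç, œ and ‹

print(len(garbled))            # 3
```

One character has become three, which matches what the console and the dump files show.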
Even if I use the ‘Data Transfer’ function in Navicat 8 to copy the two databases from server1 to server2 identically, the new blog running on server2 shows garbled characters. I tried various methods such as SET NAMES utf8, and I still cannot get it to work.
So how can I tell/force server1 MySQL to handle the latin1 characters as utf8 and get them displayed and dumped correctly?
Do a hex dump (i.e. SELECT HEX(columnname) FROM table) on both servers and see whether the data is the same. If it is, you know that at least the data didn’t get corrupted, and you just need to set the correct charset and collation on the server(s). If not, you’ll probably have to redo the data transfer, this time making sure the connection settings are correct.
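Once you have the HEX() output from each server, a small helper can tell you what kind of bytes you are looking at. This is a rough heuristic sketch in Python; `classify_hex` is a hypothetical name, and cp1252 is assumed as the equivalent of MySQL's latin1.

```python
def classify_hex(hex_value: str) -> str:
    """Guess whether bytes from SELECT HEX(col) are UTF-8 or double-encoded."""
    raw = bytes.fromhex(hex_value)
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    try:
        # If the text maps back to latin1 bytes that are themselves valid
        # UTF-8, it was probably encoded twice.
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return "UTF-8"
    # Plain ASCII survives the round trip unchanged, so it is not flagged.
    return "UTF-8" if repaired == text else "possibly double-encoded"


print(classify_hex("E79C8B"))           # UTF-8
print(classify_hex("C3A7C593E280B9"))   # possibly double-encoded
```

It is only a guess based on the bytes, so treat a "possibly double-encoded" answer as a hint to look closer, not as proof.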
Also make sure the browser’s encoding is set to UTF-8.
EDIT: So, data did get corrupted in the transfer.
C3A7C593E280B9 is the UTF-8 representation of çœ‹ (the bytes of 看 read as latin1), not of 看 itself, which would be E79C8B. This is probably because server1 is sending the data as latin1 and server2 then encodes that into UTF-8. You have to change the connection settings on server1 before transferring the data. To do that, run these queries:
Then try the data transfer again.
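The double encoding behind that hex value can be checked byte by byte. This sketch only imitates the conversion in Python; it does not talk to either server, and cp1252 is assumed to stand in for MySQL's latin1.

```python
# The correct UTF-8 bytes of 看, as server1 hands them to the client.
from_server1 = bytes.fromhex("E79C8B")

# Step 1: the connection is treated as latin1, so each byte is taken
# as one character.
sent_as_latin1 = from_server1.decode("cp1252")

# Step 2: server2 stores text as utf8, so it encodes those three
# characters again.
on_server2 = sent_as_latin1.encode("utf-8")

print(on_server2.hex().upper())              # C3A7C593E280B9
print(len(from_server1), len(on_server2))    # 3 7
```

Three bytes go in and seven come out, which is exactly the value the hex dump on server2 returned.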
EDIT 2: Based on what you said, here’s what I think is happening. The data sitting on your database is encoded in UTF-8. When PHP (WordPress) fetches this data, it “thinks” it’s encoded in latin1 (ISO-8859-1), which is (unfortunately) what PHP uses by default. PHP goes on to serve this data to the user’s browser as if it was encoded in latin1, but sets the character encoding as UTF-8, and the user sees what he’s supposed to see.
In short, it’s a case of two wrongs making a right. You now have two options:
Fix the data (i.e. read it as UTF-8 and write it back as latin1).
Set server2’s connection settings to match server1’s, so that the data is still displayed correctly.
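For the first option, "read it as UTF-8 and write it back as latin1" comes down to one decode and one encode. Here is a hedged Python sketch of that round trip on a single value; `undo_double_encoding` is a made-up name, cp1252 is assumed to match MySQL's latin1, and a real fix would have to be applied to the actual tables.

```python
def undo_double_encoding(stored: bytes) -> bytes:
    """Read the stored bytes as UTF-8, then write them back as latin1."""
    return stored.decode("utf-8").encode("cp1252")


broken = bytes.fromhex("C3A7C593E280B9")
fixed = undo_double_encoding(broken)

print(fixed.hex().upper())               # E79C8B
assert fixed.decode("utf-8") == "看"
```

One caveat: Python's cp1252 leaves five byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) undefined while MySQL's latin1 does not, so this sketch may raise an error on characters whose UTF-8 bytes include one of those.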