I select data from MySQL. The database is not in UTF-8: Unicode text is stored as Latin-1, so the Unicode string Đỗ Tiến (the correct form) is saved as Äá»— Tiến. If I echo it to HTML with PHP and set <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />, the web page displays correctly.
If I do not set the meta tag, Chrome detects the page as windows-1258 encoded; when I manually switch the encoding to Unicode (UTF-8), the web page displays correctly.
The problem is that when I select the data from MySQL using JDBC, I convert it like this:
byte[] asciiBytes1 = "Äá»— tiến".getBytes("Cp1258");
byte[] asciiBytes2 = "Äá»— tiến".getBytes("ISO-8859-1");
String unicode1 = new String(asciiBytes1, "UTF-8");
String unicode2 = new String(asciiBytes2, "UTF-8");
System.out.println(unicode1);//�?ỗ tiến
System.out.println(unicode2);//Đ�? tiến
As a result, Java does not convert the string properly. I tried many of the encodings listed at http://docs.oracle.com/javase/1.4.2/docs/guide/intl/encoding.doc.html, not only Cp1258 and ISO-8859-1, but none of them works.
Two simple ways of converting do work: use an HTML file containing the string Äá»— tiến as I mentioned above, or use Notepad++: set the encoding to ANSI, paste the string Äá»— tiến, then change the encoding to UTF-8, and it displays Đỗ Tiến (the correct string I want).
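That’s kinda complicated. The data is in a modified Windows-1252 where the bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D, which are normally
not assigned, are replaced with the respective C1 control characters (U+0081, U+008D, U+008F, U+0090 and U+009D). It seems Java doesn’t take this into account by default
when using Windows-1252, so getBytes turns those characters into ? and the UTF-8 byte sequence is broken.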
It is easiest to just fix your database and use UTF-8 everywhere.
Here’s the code anyway
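A minimal sketch of that conversion, assuming the garbled text arrives in Java as a String in which the five unassigned bytes show up as the C1 characters U+0081, U+008D, U+008F, U+0090 and U+009D. The names MojibakeFixer, CHAR_TO_BYTE and fix are hypothetical, and the sketch assumes the JDK in use provides the windows-1252 charset (it is not one of the charsets every Java platform is required to support).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: turns UTF-8 text that was misread as modified Windows-1252 back into Unicode.
class MojibakeFixer {
    // Reverse table: character -> original byte. Built once, not on every call.
    private static final Map<Character, Byte> CHAR_TO_BYTE = new HashMap<>();

    static {
        Charset cp1252 = Charset.forName("windows-1252");
        for (int b = 0; b < 256; b++) {
            char c;
            if (b == 0x81 || b == 0x8D || b == 0x8F || b == 0x90 || b == 0x9D) {
                // Unassigned in Windows-1252: the modified variant uses the C1 character with the same value.
                c = (char) b;
            } else {
                c = new String(new byte[] {(byte) b}, cp1252).charAt(0);
            }
            CHAR_TO_BYTE.put(c, (byte) b);
        }
    }

    static String fix(String garbled) {
        byte[] bytes = new byte[garbled.length()];
        for (int i = 0; i < bytes.length; i++) {
            Byte b = CHAR_TO_BYTE.get(garbled.charAt(i));
            if (b == null) {
                throw new IllegalArgumentException("Not a modified Windows-1252 character at index " + i);
            }
            bytes[i] = b;
        }
        // The recovered bytes are the original UTF-8 encoding.
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The garbled form of the name from the question, written with escapes to avoid source-encoding issues.
        String garbled = "\u00C4\u0090\u00E1\u00BB\u2014 Ti\u00E1\u00BA\u00BFn";
        System.out.println(fix(garbled));
    }
}
```

Throwing on an unknown character is a design choice for the sketch; replacing it with ? would be the more lenient alternative.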
Here’s the opposite:
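A minimal sketch of the reverse direction, under the same assumptions: each UTF-8 byte of the correct string is mapped to the character that modified Windows-1252 assigns to it, with the five unassigned bytes becoming the C1 characters of the same value. MojibakeMaker, BYTE_TO_CHAR and garble are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: produces the garbled form that results from reading UTF-8 as modified Windows-1252.
class MojibakeMaker {
    // Forward table: byte value -> character. Built once, not on every call.
    private static final char[] BYTE_TO_CHAR = new char[256];

    static {
        Charset cp1252 = Charset.forName("windows-1252");
        for (int b = 0; b < 256; b++) {
            if (b == 0x81 || b == 0x8D || b == 0x8F || b == 0x90 || b == 0x9D) {
                // Unassigned in Windows-1252: use the C1 character with the same value.
                BYTE_TO_CHAR[b] = (char) b;
            } else {
                BYTE_TO_CHAR[b] = new String(new byte[] {(byte) b}, cp1252).charAt(0);
            }
        }
    }

    static String garble(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(utf8.length);
        for (byte b : utf8) {
            sb.append(BYTE_TO_CHAR[b & 0xFF]); // mask to treat the byte as unsigned
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The correct name from the question, written with escapes to avoid source-encoding issues.
        System.out.println(garble("\u0110\u1ED7 Ti\u1EBFn"));
    }
}
```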
You probably want to stash the map and array somewhere (static fields, for example) instead of creating them every time the methods are called.