I have written an application that parses the HTML code of some web pages. My problem is with inserting that data into my MySQL database. For example, I want to insert ľščťžýáíé, and when I look into the table I get ?š??žýáíé.
I guess the problem could be that the HTML pages I'm downloading are encoded in cp1250, but the database is utf8.
This is how I download the data:
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(),"cp1250"));
Do you have any ideas how to fix this problem? I have already run out.
Edit: oh, and when I write the data out to the console (with System.out, I know I shouldn't use it… 🙂 ), every character shows up correctly.
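Since the console output is correct, the cp1250 decoding itself seems fine, and the characters are probably lost later, on the way to the database. A minimal sketch of one possible cause, assuming (this is only a guess, not something confirmed above) that the connection fell back to a Windows-1252 style charset such as MySQL's latin1. The class and method names are made up for illustration. Encoding the sample string into that charset replaces every unmappable character with '?', which reproduces the pattern seen in the table:

```java
import java.nio.charset.Charset;

class QuestionMarkDemo {
    // Encode the text into the given charset and decode it again.
    // String.getBytes(Charset) replaces characters the charset
    // cannot represent with the charset's replacement byte ('?').
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // The sample string from the question, written with Unicode
        // escapes so the source file encoding does not matter.
        String original = "\u013E\u0161\u010D\u0165\u017E\u00FD\u00E1\u00ED\u00E9";

        // Assumed fallback charset: some characters do not exist in it.
        System.out.println(roundTrip(original, "windows-1252"));

        // UTF-8 can represent every character, so nothing is lost.
        System.out.println(roundTrip(original, "UTF-8"));
    }
}
```

What the console actually prints depends on its own encoding, but the string returned for windows-1252 keeps š, ž, ý, á, í, é and turns ľ, č, ť into question marks.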
So I found out what works.
As I'm connecting to MySQL via JDBC, I had to force the connection to use utf8 by appending the following to the connection string:
?useUnicode=true&characterEncoding=utf8
And this did the trick.
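For anyone who wants to see where those parameters go, here is a rough sketch of building the full JDBC URL. The host, port, and schema name are placeholders I made up, and the helper itself is hypothetical. Only the two query parameters come from the fix above.

```java
class JdbcUrlExample {
    // Builds a MySQL JDBC URL that forces a Unicode/utf8 connection.
    // host, port and database are hypothetical placeholders.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=utf8";
    }

    public static void main(String[] args) {
        String url = buildUrl("localhost", 3306, "mydb");
        System.out.println(url);
        // With the MySQL JDBC driver on the classpath, the connection
        // would then be opened with something like:
        // java.sql.DriverManager.getConnection(url, user, password);
    }
}
```

The actual connection call is left as a comment so the sketch runs without a database or driver.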