I’m trying to read weather information from the Google Weather API.
My code looks similar to this:
String googleWeatherUrl = "http://www.google.de/ig/api?weather=berlin&hl=de";
InputStream in = null;
String xmlString = "";
String line = "";
URL url = null;
try {
    url = new URL(googleWeatherUrl);
    in = url.openStream();
    BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(in, UTF_8));
    while ((line = bufferedReader.readLine()) != null) {
        xmlString += line;
    }
} catch (MalformedURLException e) {
} catch (IOException e) {
}
DocumentBuilder builder = null;
Document doc = null;
try {
    builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    InputSource source = new InputSource(new StringReader(xmlString));
    doc = builder.parse(source);
} catch (ParserConfigurationException e) {
} catch (FactoryConfigurationError e) {
} catch (SAXException e) {
} catch (IOException e) {
}
Basically it works like a charm, but when the returned data contains umlauts (ö, ü, ä, …), those characters are not displayed properly. In Eclipse, in the browser, and in the corresponding page source they show up as rectangles (or something similarly strange).
In fact, the variable xmlString already contains the corrupted umlauts, so the damage happens before the XML is parsed.
Does anybody have an idea what causes this?
Thanks and best regards,
Paul
Welcome to the magical world of Character Encodings. Please leave your sanity on the rack by the door…
You most likely need to use
source.setEncoding(encoding)
and specify the correct character encoding for the web page. If you’re lucky, the encoding might actually be specified in the headers.
Change your input stream’s encoding to “Latin1” like so:
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(in, Charset.forName("Latin1")));
This returns proper German characters when tested on my machine:
<current_conditions><condition data="Meistens bewölkt"/>
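If you would rather not hard-code Latin1, here is a minimal sketch of picking the charset from the Content-Type header instead. The class and method names (CharsetFromHeader, charsetOf, decode) are made up for illustration, and the ISO-8859-1 fallback is an assumption based on what worked above, not something the API documents. The sample bytes stand in for the real response so the example runs without a network call.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of any library.
final class CharsetFromHeader {

    // Returns the charset named in a Content-Type value such as
    // "text/xml; charset=ISO-8859-1", or the fallback if none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // unknown or malformed charset name
                }
            }
        }
        return fallback;
    }

    // Decodes the raw response bytes; falling back to ISO-8859-1 is an assumption.
    static String decode(byte[] body, String contentType) {
        return new String(body, charsetOf(contentType, StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        // Simulated response body: "Meistens bewölkt" encoded as ISO-8859-1.
        byte[] body = "Meistens bew\u00f6lkt".getBytes(StandardCharsets.ISO_8859_1);

        // Decoded with the charset from the header, the umlaut survives.
        System.out.println(decode(body, "text/xml; charset=ISO-8859-1"));

        // Decoded as UTF-8, the ö byte is invalid and becomes U+FFFD,
        // which is typically rendered as a rectangle or question mark.
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

In the code from the question, the header value would come from url.openConnection().getContentType(), and the resulting Charset would be passed to the InputStreamReader in place of UTF_8.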