I’m trying to read an image from a URL. As suggested in the Java documentation, I tried converting the URL to a URI:
String imageURL = "http://www.shefinds.com/files/Christian-Louboutin-Décolleté-100-pumps.jpg";
URL url = new URL(imageURL);
url = new URI(url.getProtocol(), url.getHost(), url.getFile(), null).toURL();
URLConnection conn = url.openConnection();
InputStream is = conn.getInputStream();
I get a java.io.FileNotFoundException for the file
http://www.shefinds.com/files/Christian-Louboutin-Décolleté-100-pumps.jpg
What am I doing wrong and what is the right way to encode this URL?
Update:
I’m using ROME to read in RSS feeds. Following BalusC’s suggestions, I printed out the raw input at different stages, and it seems that the ROME RSS parser is using ISO-8859-1 instead of UTF-8.
Works fine here (returns a 403, it’s at least not a 404):
When I fix it so that it doesn’t return a 403, the picture is correctly retrieved:
So your problem lies somewhere else. Converting is actually not needed. The initial URL is valid.
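For readers who want to repeat this kind of check, here is a minimal sketch. It is not the code used above: the class name OpenImageSketch and the helper prepare are made up for illustration, and it assumes the 403 comes from the server rejecting Java’s default User-Agent, which is only a guess. The request is sent only when a URL is passed on the command line.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

class OpenImageSketch {

    // Creates the connection object and sets a browser-like User-Agent.
    // Nothing is sent over the network until the connection is actually used.
    // Assumption: the server answers 403 to Java's default User-Agent.
    static URLConnection prepare(String imageUrl) throws IOException {
        URL url = new URL(imageUrl);
        URLConnection conn = url.openConnection();
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java OpenImageSketch <http-url>");
            return;
        }
        // The cast assumes an http or https URL was given.
        HttpURLConnection http = (HttpURLConnection) prepare(args[0]);
        // 200 means the image was found; 404 means the path did not match.
        System.out.println("HTTP status: " + http.getResponseCode());
        http.disconnect();
    }
}
```

Checking the status code first makes it easier to tell a rejected request (403) from a path that really does not exist (404), which is what surfaces as the FileNotFoundException.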
Maybe you’re obtaining the actual URL from some binary source using the wrong character encoding? The transition of é to Ã© suggests that the original source was UTF-8 encoded and that the code incorrectly read it using ISO-8859-1 instead of UTF-8.
Update: or maybe you’ve actually hardcoded it in the Java source code and are saving the source file itself using the wrong encoding. I’ve configured my editor (Eclipse) to save files using UTF-8, and -Dfile.encoding also defaults to UTF-8, which would explain why it works on my machine 😉
Update 2: as per the comments, in a nutshell, everything should work fine if the encoding used to save the source file matches the default -Dfile.encoding of the runtime platform (and the character encoding in question supports the é). To avoid unforeseen clashes when you distribute the code, it’s indeed better to replace hardcoded non-ASCII chars with Unicode escapes.
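To make the encoding mix-up concrete, here is a small sketch that reproduces it without touching the network. The class name MojibakeSketch and the helpers misdecode and encode are made up for illustration. The path is written with Unicode escapes so the result does not depend on how the source file is saved, and the percent-encoding goes through URI.toASCIIString(), which encodes non-ASCII characters as UTF-8.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

class MojibakeSketch {

    // Simulates the suspected bug: UTF-8 bytes decoded as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Builds the URL with the multi-argument URI constructor and
    // percent-encodes any non-ASCII characters as UTF-8.
    static String encode(String path) throws URISyntaxException {
        return new URI("http", "www.shefinds.com", path, null).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Unicode escapes keep this source file pure ASCII.
        String good = "/files/Christian-Louboutin-D\u00e9collet\u00e9-100-pumps.jpg";
        String bad = misdecode(good);

        // Correctly decoded path: each accented letter becomes %C3%A9.
        System.out.println(encode(good));
        // Mis-decoded path: each accented letter becomes %C3%83%C2%A9,
        // which names a different resource on the server.
        System.out.println(encode(bad));
    }
}
```

If the mis-decoded form is what ends up in the request, the server is asked for a file that does not exist, which would be consistent with the FileNotFoundException in the question.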