Problem: Arabic words in my text files, when read by Java, show up as a series of question marks: ??????
Here is the code:
    File[] fileList = mainFolder.listFiles();
    BufferedReader bufferReader = null;
    Reader reader = null;
    try {
        for (File f : fileList) {
            reader = new InputStreamReader(new FileInputStream(f.getPath()), "UTF8");
            bufferReader = new BufferedReader(reader);
            String line = null;
            while ((line = bufferReader.readLine()) != null) {
                System.out.println(new String(line.getBytes(), "UTF-8"));
            }
        }
    }
    catch (Exception exc) {
        exc.printStackTrace();
    }
    finally {
        // Close the BufferedReader
        try {
            if (bufferReader != null)
                bufferReader.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
As you can see, I have specified the UTF-8 encoding in several places, yet I still get question marks. Do you have any idea how I can fix this?
Thanks
Replace
    System.out.println(new String(line.getBytes(), "UTF-8"));
by
    System.out.println(line);

The String#getBytes() without the charset argument uses the platform default encoding to get the bytes from the string, which may not be UTF-8. You're already reading the bytes as UTF-8 via InputStreamReader, so you don't need to convert the string back and forth afterwards.

Further, ensure that your display console (where you're reading those lines) supports UTF-8. In Eclipse, for example, you can do that via Window > Preferences > General > Workspace > Text File Encoding > Other > UTF-8.
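To make the difference concrete, here is a minimal sketch, not your exact setup. The class name Utf8ReadDemo, the helper methods readUtf8Lines and roundTrip, and the temporary file are all made up for illustration. US-ASCII stands in for a platform default encoding that cannot represent Arabic. Wrapping System.out in a UTF-8 PrintStream is one possible way to control the output encoding, assuming the console itself can display UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class Utf8ReadDemo {

    // Reads all lines of a file, decoding its bytes as UTF-8.
    static List<String> readUtf8Lines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Mimics new String(line.getBytes(), "UTF-8") on a machine whose
    // default charset is `platformDefault` (a hypothetical stand-in).
    static String roundTrip(String line, Charset platformDefault) {
        return new String(line.getBytes(platformDefault), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "marhaba" in Arabic, written with escapes so the source file's
        // own encoding does not matter.
        String arabic = "\u0645\u0631\u062D\u0628\u0627";

        // Hypothetical sample file, written as UTF-8 bytes.
        Path file = Files.createTempFile("utf8-demo", ".txt");
        Files.write(file, arabic.getBytes(StandardCharsets.UTF_8));

        // Encode console output as UTF-8; the console must be able to show it.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        for (String line : readUtf8Lines(file)) {
            out.println(line); // the decoded line, printed directly
            // Unmappable characters become '?' during getBytes, so this prints ?????
            out.println(roundTrip(line, StandardCharsets.US_ASCII));
        }
        Files.delete(file);
    }
}
```

The first println keeps the string exactly as the reader decoded it. The second one loses every Arabic character at the getBytes step, which is the same effect as the question marks in the question.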