With the code below, I read a text file that contains the single character 'a' (Unicode code point 97):
int ini;
// Buffered Reader Text file read per character
while ((ini = jer.read()) != -1) {
    char inp = (char) ini;
    System.out.println(inp);
    if (listahan.containsKey(inp)) {
        listahan.put(inp, listahan.get(inp) + 1);
    } else {
        listahan.put(inp, 1);
    }
}
// ENHANCED FOR LOOP FOR DISPLAYING IN CONSOLE
for (Map.Entry<Character, Integer> e : listahan.entrySet()) {
    System.out.printf("%1d.) %-15s : %-3d%n", ctr++, e.getKey(), e.getValue());
}
The output was:
1.) : 1 // (must be a null)
2.) a : 1
3.) þ : 1
4.) ÿ : 1
Why is the output not like this?
1.) a :1
You ran into a byte order mark (BOM), U+FEFF. Read as separate bytes, it shows up as 254 (0xFE, þ) and 255 (0xFF, ÿ).
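To make this concrete, here is a minimal sketch that decodes four bytes one byte per character. The byte sequence is an assumption about what your file contains ("a" saved as UTF-16 little-endian with a BOM), and the class and method names are made up for illustration. ISO-8859-1 stands in for whatever single-byte charset your reader actually used.

```java
import java.nio.charset.StandardCharsets;

class BomBytesDemo {
    // ISO-8859-1 maps every byte value n directly to the character U+00nn,
    // so each byte of the input becomes exactly one char.
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Assumed file content: BOM (FF FE), then 'a' as the two bytes 61 00.
        byte[] fileBytes = {(byte) 0xFF, (byte) 0xFE, 0x61, 0x00};
        for (char c : decodeAsLatin1(fileBytes).toCharArray()) {
            // Prints U+00FF, U+00FE, U+0061, U+0000, one per line.
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Those four code points are ÿ, þ, a and the null character, which matches the four map entries in your output.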
This, together with the occurrence of the null character, probably implies that the file is encoded in UTF-16 or UCS-2 (also known as wide string, wchar, …), while your reader decodes it one byte per character. I suggest you read up on Unicode encodings if you don't know what that means. For this, I recommend the great article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" by Joel Spolsky.
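If the file really is UTF-16 with a BOM, one possible fix is to tell the reader which charset to use instead of relying on the platform default. The sketch below assumes that encoding and reads from an in-memory byte array so it is self-contained. The class and method names are hypothetical, and `Map.merge` replaces your `containsKey`/`put` pair. Java's `UTF_16` charset uses the BOM to pick the byte order when decoding and does not pass the BOM on as a character.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class Utf16CountDemo {
    // Counts how often each character occurs, decoding the stream as UTF-16.
    static Map<Character, Integer> countChars(InputStream in) throws IOException {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_16))) {
            int code;
            while ((code = reader.read()) != -1) {
                // Insert 1 for a new character, otherwise add 1 to its count.
                counts.merge((char) code, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed file content: BOM (FF FE), then 'a' as the two bytes 61 00.
        byte[] fileBytes = {(byte) 0xFF, (byte) 0xFE, 0x61, 0x00};
        System.out.println(countChars(new ByteArrayInputStream(fileBytes)));
    }
}
```

For a real file, `Files.newBufferedReader(path, StandardCharsets.UTF_16)` should give you an equivalent reader. If the file turns out to use a different encoding, pass that charset instead.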