I have a RandomAccessFile object that reads bytes from a file and stores them in a byte array. The bytes should make up a string of Hebrew letters.
If I debug this in Java on a desktop, then for 4 bytes I get, for example (in hex):
data[0]=E7
data[1]=FA
data[2]=E5
data[3]=EC
(so each letter is 1 byte long, which makes sense)
When I construct a String str from them I get:
str[0]=\u05D7
str[1]=\u05EA
str[2]=\u05D5
str[3]=\u05DC
which are the correct Unicode Hebrew letters, and the string prints out just fine. Are they 2 bytes long each?
When I do the same debugging on an Android device I get the same “data” byte array, but the “str” string contains 4 identical characters, which display as 4 question marks.
My question is: how can Java take 1 byte and “know” it’s Hebrew, and how can I do the same on Android?
Thanks
Code:
iDefLength=4;
RandomAccessFile R = new RandomAccessFile(file, "r");
R.read(bDefinition, 0, iDefLength);
this.sDef = new String(bDefinition);
As for your question: No, Java cannot look at 1 byte and know that it is Hebrew, or that it belongs to any other encoding.
It is possible to guess an encoding by looking at many or all of the bytes in a file and analyzing byte frequencies (I believe Microsoft does this in IE), but obviously that can’t work on a single byte.
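What actually happens is that new String(byte[]) decodes the bytes with the platform's default charset. On your desktop that default was presumably a Hebrew code page, while Android defaults to UTF-8, where E7 FA E5 EC is not a valid sequence, so each byte turns into a replacement character. Here is a minimal sketch of naming the charset explicitly instead. The charset name "windows-1255" is my assumption about how the file was written (ISO-8859-8 maps these four bytes to the same letters), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class HebrewDecodeSketch {
    // Assumption: the file holds single-byte Hebrew text in windows-1255.
    // Replace this with whatever encoding the file was really written in.
    static final Charset ASSUMED_FILE_CHARSET = Charset.forName("windows-1255");

    // Decode the first 'length' bytes with an explicit charset, so the result
    // no longer depends on the platform default (desktop vs. Android).
    static String decode(byte[] data, int length) {
        return new String(data, 0, length, ASSUMED_FILE_CHARSET);
    }

    public static void main(String[] args) {
        // The four bytes from the question.
        byte[] data = {(byte) 0xE7, (byte) 0xFA, (byte) 0xE5, (byte) 0xEC};
        String str = decode(data, data.length);

        // Print each char as a 4-digit hex code unit.
        for (int i = 0; i < str.length(); i++) {
            System.out.printf("str[%d]=%04X%n", i, (int) str.charAt(i));
        }

        // A Java char is one 16-bit UTF-16 code unit, so these 4 letters
        // occupy 8 bytes when encoded as UTF-16.
        System.out.println(str.getBytes(StandardCharsets.UTF_16BE).length);
    }
}
```

In your code that would mean replacing new String(bDefinition) with new String(bDefinition, 0, iDefLength, charset). That also answers the other question: yes, inside a String each of these letters is a 2-byte char, no matter how many bytes it took in the file.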