I thought converting UCS-2 to ISO-8859-1 was the same as:
// Keep only the low 8 bits of each UTF-16 code unit.
rawData = new byte[data.length()];
for (int i = 0; i < data.length(); i++) {
    rawData[i] = (byte) (data.charAt(i) & 0xff);
}
This seems to be false. Why isn’t the code above equivalent to data.getBytes("ISO8859_1")? I’m on Android.
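In fact, it turns out that some of my characters had the form 0xF700 | byte, i.e. they were in the range U+F780 to U+F7FF rather than below 0x100. For some reason this happens when you fetch a binary file with XMLHttpRequest and the charset x-user-defined. When converted to Latin-1, those characters turn into ? (question marks), whereas the masking loop keeps their low byte.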
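To make the difference concrete, here is a minimal sketch of what seems to be happening. The helper decodeUserDefined is hypothetical (it is not a platform API); it only imitates the x-user-defined mapping, under the assumption that bytes 0x80 to 0xFF are decoded to U+F780 to U+F7FF.

```java
import java.nio.charset.StandardCharsets;

class UserDefinedDemo {
    // Hypothetical helper, not a platform API: imitates x-user-defined decoding.
    // Bytes 0x00-0x7F map to themselves, bytes 0x80-0xFF map to U+F780-U+F7FF.
    static String decodeUserDefined(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            int v = b & 0xff;
            sb.append((char) (v < 0x80 ? v : (0xF700 | v)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] original = {0x41, (byte) 0x80, (byte) 0xFF};
        String data = decodeUserDefined(original);

        // U+F780 and U+F7FF have no Latin-1 equivalent, so the encoder replaces them.
        byte[] latin1 = data.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(latin1[0] + " " + latin1[1] + " " + latin1[2]); // prints "65 63 63"

        // Masking keeps only the low byte, which happens to restore the original value.
        byte masked = (byte) (data.charAt(1) & 0xff);
        System.out.println(masked == original[1]); // prints "true"
    }
}
```

So the masking loop and getBytes agree on the ASCII byte and disagree on everything that was remapped into the U+F7xx range.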
Per the Android documentation:
In practice, this call ends up as the variant that takes an explicit Charset, which will substitute some replacement sequence for untranslatable characters. In the Sun JDK, this is the single byte value 63 (‘?’).
However, if, as in your comment to the earlier answer, you guarantee that there are no character values greater than 0xFF in the string, then you’re doing something wrong. ISO-8859-1 is a proper subset of UCS-2/UTF-16, so every character up to 0xFF encodes to the byte with the same value and nothing would need to be replaced.
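One way to check that guarantee is to compare both conversions and look for the first character above 0xFF. This is only a sketch: the class and method names (Latin1Check, maskLowByte, firstNonLatin1) are made up for illustration, and the sample strings are assumptions, not data from the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1Check {
    // The masking loop from the question: keeps the low 8 bits of each char.
    static byte[] maskLowByte(String data) {
        byte[] raw = new byte[data.length()];
        for (int i = 0; i < data.length(); i++) {
            raw[i] = (byte) (data.charAt(i) & 0xff);
        }
        return raw;
    }

    // Returns the index of the first char above 0xFF, or -1 if there is none.
    static int firstNonLatin1(String data) {
        for (int i = 0; i < data.length(); i++) {
            if (data.charAt(i) > 0xFF) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String ok = "caf\u00e9";   // every char is <= 0xFF
        String bad = "A\uf7e9";    // second char is outside Latin-1

        // With only Latin-1 characters, both conversions give the same bytes.
        System.out.println(Arrays.equals(
                maskLowByte(ok), ok.getBytes(StandardCharsets.ISO_8859_1))); // prints "true"

        // With a char above 0xFF, getBytes substitutes '?' and the results differ.
        System.out.println(firstNonLatin1(bad)); // prints "1"
        System.out.println(Arrays.equals(
                maskLowByte(bad), bad.getBytes(StandardCharsets.ISO_8859_1))); // prints "false"
    }
}
```

If firstNonLatin1 returns anything other than -1 for your data, the "no values above 0xFF" assumption does not hold, and that would explain the question marks.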