I thought UCS-2 to ISO-8859-1 was the same as rawData = new byte[data.length()]; for(int


Asked: May 24, 2026


I thought UCS-2 to ISO-8859-1 was the same as

    // Keep only the low 8 bits of each UTF-16 code unit.
    rawData = new byte[data.length()];
    for (int i = 0; i < data.length(); i++) {
        rawData[i] = (byte) (data.charAt(i) & 0xff);
    }

This seems to be false. Why isn’t the above code equivalent to data.getBytes("ISO8859_1")? I’m on Android.

In fact, it turns out that some of my characters had the form 0xf700 | (byte), that is, 0xF700 plus the original byte value. For some reason this happens when you fetch a binary file with XMLHttpRequest and the charset x-user-defined. When converted to Latin-1, those characters turn into ? (question marks).
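A minimal sketch of the mismatch, assuming the string holds a character in the U+F780 to U+F7FF range as described above. The class name `UserDefinedDemo` and the helper `lowBytes` are made up for illustration, and the replacement byte shown by `getBytes` is what a standard JDK produces; Android documents the behavior as unspecified.

```java
import java.nio.charset.StandardCharsets;

final class UserDefinedDemo {
    // Hypothetical helper: the masking loop from the question.
    static byte[] lowBytes(String data) {
        byte[] raw = new byte[data.length()];
        for (int i = 0; i < data.length(); i++) {
            raw[i] = (byte) (data.charAt(i) & 0xff);
        }
        return raw;
    }

    public static void main(String[] args) {
        // 'A' is plain ASCII; U+F7E9 is 0xF700 | 0xE9, outside ISO-8859-1.
        String data = "A\uF7E9";
        byte[] masked = lowBytes(data);
        byte[] encoded = data.getBytes(StandardCharsets.ISO_8859_1);
        // Masking recovers 0xE9; encoding substitutes a replacement byte.
        System.out.printf("masked:  %02x %02x%n", masked[0] & 0xff, masked[1] & 0xff);
        System.out.printf("encoded: %02x %02x%n", encoded[0] & 0xff, encoded[1] & 0xff);
    }
}
```

The two results differ only for the second character, which is the one that has no Latin-1 equivalent.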


1 Answer

Editorial Team · Answer 1 · 2026-05-24T10:37:50+00:00

Per the Android documentation:

The behavior when this string cannot be represented in the named charset is unspecified.

In practice, this call ends up in the getBytes overload that takes an explicit Charset, which substitutes the charset's replacement bytes for untranslatable characters. In the Sun JDK, that is the single byte value 63 (0x3F, ‘?’).
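If silent replacement is hiding the problem, one option is to encode with a `CharsetEncoder` configured to report errors instead. This is only a sketch: `StrictLatin1` and `encodeStrict` are hypothetical names, and it assumes you would rather get an exception than a ‘?’ byte.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictLatin1 {
    // Hypothetical helper: encode to ISO-8859-1, failing on unmappable input.
    static byte[] encodeStrict(String s) throws CharacterCodingException {
        ByteBuffer out = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        try {
            // U+00E9 fits in Latin-1, U+F7E9 does not.
            encodeStrict("caf\u00E9 \uF7E9");
        } catch (CharacterCodingException e) {
            System.out.println("not representable in ISO-8859-1: " + e);
        }
    }
}
```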

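However, if, as in your comment to the earlier answer, you can guarantee that there are no character values greater than 0xFF in the string and the two approaches still disagree, then you’re doing something wrong. ISO-8859-1 is a proper subset of UCS-2/UTF-16: its 256 code points map one-to-one onto U+0000 through U+00FF.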
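The subset claim can be checked directly. The sketch below, with the made-up names `Latin1Equivalence`, `lowBytes`, and `firstNonLatin1`, compares the masking loop against `getBytes` for every character from U+0000 to U+00FF and includes a small helper for locating the first character that breaks the guarantee.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Latin1Equivalence {
    // The masking loop from the question, wrapped in a hypothetical helper.
    static byte[] lowBytes(String data) {
        byte[] raw = new byte[data.length()];
        for (int i = 0; i < data.length(); i++) {
            raw[i] = (byte) (data.charAt(i) & 0xff);
        }
        return raw;
    }

    // Index of the first char above 0xFF, or -1 if every char fits in Latin-1.
    static int firstNonLatin1(String data) {
        for (int i = 0; i < data.length(); i++) {
            if (data.charAt(i) > 0xFF) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int c = 0; c <= 0xFF; c++) {
            sb.append((char) c);
        }
        String all = sb.toString();
        boolean same = Arrays.equals(
                lowBytes(all), all.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("loop matches getBytes for U+0000..U+00FF: " + same);
        System.out.println("first offending index: " + firstNonLatin1("ab\uF7E9"));
    }
}
```

If `firstNonLatin1` returns anything other than -1 for your data, the guarantee does not hold and the difference from `getBytes` is expected.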
