As far as I know, when the JRE runs a Java application,
strings are represented internally as UCS-2 byte arrays.
On Wikipedia, I found the following:
Java originally used UCS-2, and added UTF-16 supplementary character support in J2SE 5.0.
With the new release of Java (Java 7),
what is its internal character encoding?
Is there any possibility that Java will start to use UCS-4 internally?
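The UTF-16 behaviour mentioned in the Wikipedia quote can be observed directly from code. The following is only a minimal sketch, assuming a Java 5 or later runtime; the class name `Utf16Demo` and the helper `measure` are made up for illustration. It builds a string from a single code point outside the Basic Multilingual Plane and compares the number of `char` units with the number of code points.

```java
class Utf16Demo {
    // Hypothetical helper: returns {UTF-16 code units, Unicode code points}.
    static int[] measure(String s) {
        return new int[] { s.length(), s.codePointCount(0, s.length()) };
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP, so in UTF-16
        // it is stored as a surrogate pair: U+D834 followed by U+DD1E.
        String clef = new String(Character.toChars(0x1D11E));

        int[] m = measure(clef);
        System.out.println("char units:  " + m[0]);
        System.out.println("code points: " + m[1]);

        // The first char of the pair is a high surrogate, the second a low one.
        System.out.println("high surrogate first: "
                + Character.isHighSurrogate(clef.charAt(0)));
        System.out.println("low surrogate second: "
                + Character.isLowSurrogate(clef.charAt(1)));
    }
}
```

If the platform used UCS-4, one code point would always correspond to one storage unit; here `length()` counts UTF-16 code units, which is why it can differ from `codePointCount`.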
Java 7 still uses UTF-16 internally (see the last section of the Charset Javadoc), and it’s very unlikely that this will change to UCS-4. I’ll give you two reasons for that: