How can I put a supplementary Unicode character (say, code point U+10400) in a string literal?
I have tried putting a surrogate pair like this:
String text = "TEST \uD801\uDC00";
System.out.println(text);
but it doesn’t seem to work.
UPDATE:
The good news is, the string is constructed properly.
Byte array in UTF-8: 54 45 53 54 20 f0 90 90 80
Byte array in UTF-16: fe ff 00 54 00 45 00 53 00 54 00 20 d8 01 dc 00
But the bad news is that it is not printed properly on my Fedora box: I see a square instead of the expected symbol, because my console doesn't support Unicode properly.
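To double-check that the literal itself is fine, here is a minimal sketch that builds the same string from the code point and dumps its bytes. The class name SupplementaryLiteral and the helpers fromCodePoint and hex are made up for illustration; the last two lines of main assume a UTF-8 terminal with a font that has a glyph for U+10400, which may not hold on every machine.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class SupplementaryLiteral {
    // Hypothetical helper: builds the text from the code point instead of a surrogate pair.
    public static String fromCodePoint() {
        return new StringBuilder("TEST ").appendCodePoint(0x10400).toString();
    }

    // Hypothetical helper: formats bytes as two-digit lowercase hex, separated by spaces.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String literal = "TEST \uD801\uDC00";

        // The surrogate-pair literal and the code-point version are the same string.
        System.out.println(literal.equals(fromCodePoint())); // true

        System.out.println(hex(literal.getBytes(StandardCharsets.UTF_8)));
        // 54 45 53 54 20 f0 90 90 80
        System.out.println(hex(literal.getBytes(StandardCharsets.UTF_16)));
        // fe ff 00 54 00 45 00 53 00 54 00 20 d8 01 dc 00

        // Assumption: the terminal decodes UTF-8. If it does not, or the font
        // lacks the glyph, a box or question mark is still expected here.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(literal);
    }
}
```

If the byte dumps match, the remaining problem is on the display side (terminal encoding or font), not in the string literal.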
“Works for me.” What exactly is the issue?
Output:
Note that length(), like most String methods, deals with
chars (UTF-16 code units), not Unicode code points. So much for awesome Unicode support 🙂 Happy coding.
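The difference between chars and code points can be seen with a small sketch like the one below. The class name CodePointLength and its two helpers are invented for this example; the String methods they call (length, codePointCount, codePointAt) are standard.

```java
class CodePointLength {
    // Number of UTF-16 code units, which is what String.length() reports.
    public static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    public static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String text = "TEST \uD801\uDC00";

        System.out.println(charCount(text));      // 7
        System.out.println(codePointCount(text)); // 6

        // charAt(5) returns only the high surrogate; codePointAt(5) combines the pair.
        System.out.println(Integer.toHexString(text.charAt(5)));      // d801
        System.out.println(Integer.toHexString(text.codePointAt(5))); // 10400
    }
}
```

So code that indexes or counts by char will see the supplementary character as two units, while the codePoint* methods treat it as one.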