How to get encoded version of string (e.g. \u0421\u043b\u0443\u0436\u0435\u0431\u043d\u0430\u044f) using Java?
EDIT:
I guess the question is not very clear… Basically what I want is this:
Given the string s = "blalbla", I want to get the string "\uXXXX\uYYYY".
You will need to extract each code point/unit from the String and encode it yourself. The following works for all Strings even if the individual linguistic characters within the String are composed of digraphs or ligatures.
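A minimal sketch of the per-code-unit approach, for illustration only: the class name `UnicodeEscaper` and the method name `escape` are assumptions, not names from the original answer. It walks the string one UTF-16 code unit at a time and formats each unit as four lowercase hex digits.

```java
public final class UnicodeEscaper {
    private UnicodeEscaper() {}

    // Hypothetical helper: returns one escape (backslash, 'u', four hex digits)
    // for every UTF-16 code unit in the input.
    public static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() * 6);
        for (int i = 0; i < input.length(); i++) {
            // charAt returns a single code unit; a supplementary character
            // appears here as two consecutive units (a surrogate pair).
            out.append(String.format("\\u%04x", (int) input.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Cyrillic sample word from the question, written with escapes
        // so the source file stays ASCII-only.
        String sample = "\u0421\u043b\u0443\u0436\u0435\u0431\u043d\u0430\u044f";
        System.out.println(escape(sample));
    }
}
```

Iterating over code units rather than code points is deliberate in this sketch: it mirrors what a Java Unicode escape can express, since each escape holds exactly one 16-bit unit.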
The above produces output as dictated by the Java Language Specification on Unicode escapes, i.e. it produces output of the form \uxxxx for each UTF-16 code unit. It addresses supplementary characters by producing a pair of code units represented as \uxxxx\uyyyy.
The originally posted code has been modified to produce Unicode code points in the format U+FFFFF.
The gruntwork is done by the String.codePointAt() method, which returns the Unicode code point at a particular index. For a String instance composed of combining characters, String.length() does not return the number of visible characters. It returns the number of UTF-16 code units, which equals the number of code points as long as every character lies in the Basic Multilingual Plane. For example, क and ् combine to form क् in Devanagari, and the above function will rightfully return U+0915 U+094d without any fuss, as String.length() will return 2 for the combined character.
Strings with supplementary characters yield a single code point for each individual character, even though each one occupies two code units. For example, the word "Javascript" written using the supplementary character set for Mathematical Alphanumeric Symbols will return U+1d4a5 U+1d4b6 U+1d4cb U+1d4b6 U+1d4c8 U+1d4b8 U+1d4c7 U+1d4be U+1d4c5 U+1d4c9.
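The code-point listing described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the answer's original listing: the class name `CodePointLister` and the method name `toCodePoints` are hypothetical. It relies on `String.codePointAt()` to read a whole code point and on `Character.charCount()` to step over both halves of a surrogate pair.

```java
public final class CodePointLister {
    private CodePointLister() {}

    // Hypothetical helper: lists every code point of the input in the
    // form U+xxxx (lowercase hex, at least four digits), space-separated.
    public static String toCodePoints(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            int codePoint = input.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04x", codePoint));
            // Advance by 1 for BMP characters, by 2 for supplementary ones.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Devanagari letter KA followed by the virama sign.
        System.out.println(toCodePoints("\u0915\u094d")); // U+0915 U+094d
    }
}
```

For a supplementary character such as U+1d4a5, `String.length()` counts two code units while this helper reports one code point, which is the difference the explanation above turns on.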