I have a String variable containing ASCII characters and double-byte characters (for example, Chinese, Japanese, …).
How do I determine the total length of the String? I also want to use the String substring/replace functions on it.
As others have said, Java Strings are conceptually read-only arrays of Java characters, and the “length” of a String is the number of Java characters. However, there are complicating issues:
A Java character is not necessarily what you think of as a character. In particular, there are more Unicode characters (code-points) than can be represented using Java characters. Some Unicode code-points require two Java characters to represent them. (This is the “extended plane” issue that Thilo refers to.)
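To make the code-point issue concrete, here is a minimal sketch using String.codePointCount and String.offsetByCodePoints from the standard library. The class name CodePointDemo and the helpers codePointLength and codePointSubstring are hypothetical names chosen for illustration, and the sample string is an assumption, not something from the question.

```java
class CodePointDemo {
    // Number of Unicode code points, as opposed to Java chars (UTF-16 code units).
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Substring addressed by code-point indexes [begin, end) rather than char indexes.
    static String codePointSubstring(String s, int begin, int end) {
        int from = s.offsetByCodePoints(0, begin);
        int to = s.offsetByCodePoints(from, end - begin);
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        // 'a', one CJK character (U+4E2D), one emoji (U+1F600, a surrogate pair), 'b'
        String s = "a\u4E2D\uD83D\uDE00b";

        System.out.println(s.length());          // 5 Java chars
        System.out.println(codePointLength(s));  // 4 code points

        // Code points 1 and 2: the CJK character and the emoji, pair kept intact.
        System.out.println(codePointSubstring(s, 1, 3));

        // replace(CharSequence, CharSequence) matches whole char sequences,
        // so replacing the full surrogate pair does not split it.
        System.out.println(s.replace("\uD83D\uDE00", "!"));
    }
}
```

For text that stays within the Basic Multilingual Plane (which covers most everyday Chinese and Japanese), length() and substring() with plain char indexes already give the expected results. The code-point variants only matter once supplementary characters can appear.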
Some JVMs (in some cases only with the appropriate JVM flags set at startup) will use a more compact internal String representation. HotSpot since Java 9, for example, by default stores Strings whose characters all fit in Latin-1 using one byte per character. The length of the String is unchanged (it is still the number of Java characters), but the memory used can be significantly less.
Then there is the question of how many bytes are required to represent the String’s characters as UTF-8, or in some other encoding. As far as I know, the only JVM-provided way to find that out is to do the conversion; e.g. using getBytes(charSet).
Finally, there is the question of how many bytes a String occupies in the heap. You can find out how many bytes are in the String object and its associated char[] backing object. However, predicting what that is going to be can be tricky when you consider that, in older JVMs, substring and other String methods can create sets of strings that share a single backing array. (Since Java 7 update 6, substring copies the characters instead of sharing the array.)
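For the encoded-size question, the conversion approach mentioned above can be sketched as follows. The class name EncodedLengthDemo and the helper encodedLength are hypothetical, and the sample string is an assumption. String.getBytes(Charset) and StandardCharsets are standard library APIs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedLengthDemo {
    // Number of bytes needed to encode the String in the given charset.
    // This performs the full conversion and allocates a temporary byte array.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // 'a', one CJK character (U+4E2D), one emoji (U+1F600, a surrogate pair), 'b'
        String s = "a\u4E2D\uD83D\uDE00b";

        // UTF-8: 1 + 3 + 4 + 1 bytes
        System.out.println(encodedLength(s, StandardCharsets.UTF_8));    // 9

        // UTF-16BE (no byte-order mark): 2 bytes per Java char, 5 chars
        System.out.println(encodedLength(s, StandardCharsets.UTF_16BE)); // 10
    }
}
```

Note that getBytes(Charset) silently substitutes a replacement byte sequence for characters the target charset cannot represent, so the result for a legacy single-byte charset may not reflect the original text.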