How many bytes are required to store one character in:
- Microsoft’s implementation of the .NET framework, version 4
- JavaScript, as implemented by Microsoft Internet Explorer 8?
Both .NET and JavaScript use UTF-16 for their strings. UTF-16 is a variable-length encoding that uses 16-bit code units to represent Unicode code points (which need up to 21 bits). Historically it grew out of UCS-2, from the time when Unicode was still a 16-bit code; that was later deemed insufficient, hence the expansion to 21 bits.
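To make the relationship between code points and 16-bit code units concrete, here is a minimal sketch of the UTF-16 encoding step. The function name toUtf16CodeUnits is made up for illustration and is not part of .NET or of any JavaScript engine; it only does the arithmetic that the encoding defines.

```javascript
// Hypothetical helper (not a built-in): encode one Unicode code point
// as an array of 16-bit UTF-16 code units.
function toUtf16CodeUnits(codePoint) {
  if (codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError("not a Unicode code point: " + codePoint);
  }
  if (codePoint >= 0xD800 && codePoint <= 0xDFFF) {
    // This range is reserved for surrogates and cannot be encoded on its own.
    throw new RangeError("surrogate code points are not encodable");
  }
  if (codePoint <= 0xFFFF) {
    // Fits into a single 16-bit code unit.
    return [codePoint];
  }
  // Everything above U+FFFF is split into a high and a low surrogate.
  var offset = codePoint - 0x10000;      // 20 bits remain
  var high = 0xD800 + (offset >> 10);    // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
  return [high, low];
}

var euro = toUtf16CodeUnits(0x20AC);   // [0x20AC]
var clef = toUtf16CodeUnits(0x1D11E);  // [0xD834, 0xDD1E]
```

Each entry in the returned array is one code unit, so the number of bytes is the array length times two.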
Since UTF-16 uses 16-bit code units, a single code unit always takes 2 bytes, but how many bytes a character takes depends on what you actually mean by “character”:
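Character in the Unicode sense means a Unicode code point, which is probably your intended meaning. There are two cases:
- Code points up to U+FFFF (the Basic Multilingual Plane) are represented by a single code unit, thus 2 bytes.
- Code points above U+FFFF are represented by two code units (a surrogate pair), thus 4 bytes.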
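The code point cases can be checked directly on a string: a code point either occupies one code unit (2 bytes) or a surrogate pair (4 bytes). The following is a rough sketch; utf16BytesPerCodePoint is a hypothetical name, and the result counts only the string's code units, not any per-object overhead a particular runtime may add. It relies on nothing beyond length and charCodeAt.

```javascript
// Hypothetical helper: for each code point in a string, report how many
// bytes its UTF-16 code units occupy (2 for one unit, 4 for a surrogate pair).
function utf16BytesPerCodePoint(text) {
  var sizes = [];
  for (var i = 0; i < text.length; i++) {
    var unit = text.charCodeAt(i);
    var isHigh = unit >= 0xD800 && unit <= 0xDBFF;
    var next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    var isPair = isHigh && next >= 0xDC00 && next <= 0xDFFF;
    if (isPair) {
      sizes.push(4);
      i++; // skip the low surrogate that belongs to this code point
    } else {
      sizes.push(2);
    }
  }
  return sizes;
}

// "A", the euro sign (U+20AC), and the musical G clef (U+1D11E)
var sizes = utf16BytesPerCodePoint("A\u20AC\uD834\uDD1E"); // [2, 2, 4]
```

Note that the string's length property counts code units, so the sample string has a length of 4 even though it holds three code points.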
Character in the usual meaning often refers to a grapheme, that is, what we perceive as a single character. A grapheme can carry arbitrarily many diacritics, or may be a ligature that the rendering engine forms out of multiple code points. Long story short: in this sense a character has no fixed size and can be arbitrarily long, since it can consist of several code points.
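As a small illustration of the grapheme case, the same perceived letter can take a different number of bytes depending on how it is composed. The helper name utf16ByteLength is made up for this sketch; it simply multiplies the number of code units by two.

```javascript
// Hypothetical helper: bytes taken by a string's UTF-16 code units.
function utf16ByteLength(text) {
  return text.length * 2; // length counts 16-bit code units
}

var precomposed = "\u00E9";    // é as a single code point
var decomposed = "e\u0301";    // e + COMBINING ACUTE ACCENT
var stacked = "e\u0301\u0323"; // e + acute accent + COMBINING DOT BELOW

var byteCounts = [
  utf16ByteLength(precomposed), // 2
  utf16ByteLength(decomposed),  // 4
  utf16ByteLength(stacked)      // 6
];
```

All three strings are perceived as one character, yet every additional combining mark adds another code unit, so there is no upper bound.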