The size of char is 2 bytes (according to MSDN):
sizeof(char) //2
A test:
char[] c = new char[1] {'a'};
Encoding.UTF8.GetByteCount(c) //1 ?
Why is the value 1?
(Of course, if c is a non-ASCII character like ‘ש’, it does show 2, as it should.)
Is ‘a’ not a .NET char?
It’s because ‘a’ only takes one byte to encode in UTF-8.
Encoding.UTF8.GetByteCount(c) will tell you how many bytes it takes to encode the given array of characters in UTF-8. See the documentation for Encoding.GetByteCount for more details. That’s entirely separate from how wide the char type is internally in .NET.
Each character with a code point below 128 (i.e. U+0000 to U+007F) takes a single byte to encode in UTF-8.
Other characters take 2, 3 or even 4 bytes in UTF-8. (The original UTF-8 design allowed 5- and 6-byte sequences for values over U+1FFFFF, but Unicode stops at U+10FFFF, so those values are not part of Unicode at the moment, and probably never will be.)
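To make the difference concrete, here is a minimal sketch comparing the in-memory width of char with the number of bytes a few characters need in UTF-8. The class name ByteCountDemo and the helper methods Utf8Bytes and Utf16Bytes are made up for this illustration; only sizeof, Encoding.UTF8, Encoding.Unicode and GetByteCount come from .NET itself.

```csharp
using System;
using System.Text;

// Hypothetical demo class; the names are illustrative only.
static class ByteCountDemo
{
    // Bytes needed to encode the text as UTF-8.
    public static int Utf8Bytes(string text) => Encoding.UTF8.GetByteCount(text);

    // Bytes needed to encode the text as UTF-16 (Encoding.Unicode is UTF-16 little-endian).
    public static int Utf16Bytes(string text) => Encoding.Unicode.GetByteCount(text);

    static void Main()
    {
        Console.WriteLine(sizeof(char));         // 2: a char is one UTF-16 code unit
        Console.WriteLine(Utf8Bytes("a"));       // 1: U+0061 is below U+0080
        Console.WriteLine(Utf8Bytes("\u05E9"));  // 2: U+05E9 is the Hebrew letter shin
        Console.WriteLine(Utf8Bytes("\u20AC"));  // 3: U+20AC is the euro sign
        Console.WriteLine(Utf16Bytes("a"));      // 2: matches sizeof(char)
    }
}
```

So the 2 reported by sizeof(char) corresponds to the UTF-16 encoding of ‘a’, not to its UTF-8 encoding.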
Note that the only characters which take 4 bytes to encode in UTF-8 can’t be encoded in a single char anyway. A char is a UTF-16 code unit, and any Unicode code points over U+FFFF require two UTF-16 code units forming a surrogate pair to represent them.
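As a rough illustration of the surrogate-pair point, the sketch below uses U+1F600 (chosen arbitrarily as an example of a code point above U+FFFF). The class name SurrogatePairDemo and its helper methods are hypothetical; string.Length, char.IsSurrogatePair, char.ConvertToUtf32 and Encoding.UTF8.GetByteCount are the actual .NET members being exercised.

```csharp
using System;
using System.Text;

// Hypothetical demo class; the names are illustrative only.
static class SurrogatePairDemo
{
    // Number of UTF-16 code units (chars) in the string.
    public static int CodeUnits(string text) => text.Length;

    // Bytes needed to encode the text as UTF-8.
    public static int Utf8Bytes(string text) => Encoding.UTF8.GetByteCount(text);

    static void Main()
    {
        // One code point above U+FFFF, written with the 8-digit escape.
        string s = "\U0001F600";

        Console.WriteLine(CodeUnits(s));                             // 2: two chars for one code point
        Console.WriteLine(char.IsSurrogatePair(s, 0));               // True
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X"));  // 1F600
        Console.WriteLine(Utf8Bytes(s));                             // 4
    }
}
```

The string holds a single code point but has a Length of 2, which is why a 4-byte UTF-8 character never fits in one char.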