I found that wcslen() in VC++ 2010 returns the correct number of characters, whereas Xcode does not.
For example, the code below returns the correct value, 11, in VC++ 2010, but returns an incorrect 17 in Xcode 4.2.
const wchar_t *p = L"123abc가1나1다";
size_t plen = wcslen(p);
I guess the app built with Xcode stores the wchar_t string as UTF-8 in memory, which is another strange thing.
How can I get 11 in Xcode too, just like in VC++?
I ran this program on a Mac Mini running MacOS X 10.7.2 (Xcode 4.2):
When I do a hex dump of the source file, I see:
The output when compiled with GCC is:
Note that the string is truncated at the zero byte. I think that is probably a bug in the system, but it seems a little unlikely that I’d manage to find one on my first attempt at using wprintf(), so it is more likely I’m doing something wrong.
You’re right: in the multi-byte UTF-8 source code, the string occupies 17 bytes (8 one-byte Basic Latin (ASCII) characters, and 3 characters each encoded using 3 bytes). So, the raw strlen() on the source string would return 17 bytes.
GCC version is:
Just for giggles, I tried clang, and I get a different result. Compiled using:
The output when compiled with clang is:
So, now the string appears correctly, but the length is given as 17 instead of 11. Superficially, you can take your choice of bugs: string looks OK (in a terminal, /Applications/Utilities/Terminal, acclimatized to UTF-8) but length is wrong, or length is right but string does not appear correctly.
I note that sizeof(wchar_t) in both gcc and clang is 4.
The left hand does not understand what the right hand is doing. I think there’s a case for claiming both are broken, in different ways.