I have a UTF-8 string (created an std::string from a byte array) I understand

Question

Asked: May 16, 2026

I have a UTF-8 string (an std::string created from a byte array).
I understand that, because of the encoding, size()/length() won’t give me the actual number of glyphs if the text is Chinese, for instance.
I also understand that to get the Unicode code point of each glyph I need to convert the string to a wstring (or some other representation wider than UTF-8, such as UTF-16 or UTF-32), and then I can read off the values I want.

I’ve looked around and haven’t found any simple way to do it with standard C++.
What am I missing?

I’m compiling with GCC 4+ for Apple’s iPhone, using the Cocoa Touch framework.


1 Answer

Editorial Team · Answer 1 · 2026-05-16T18:30:49+00:00

To get the number of UTF-8 characters (code points) in a std::string, you can traverse the string and look at each lead byte, read as an unsigned value: between 0 and 127 it is a 1-byte character, between 194 and 223 a 2-byte character, between 224 and 239 a 3-byte character, and between 240 and 244 a 4-byte character. In each case, advance by that many bytes and count one code point.
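The traversal described above can be sketched roughly as follows. This is a minimal illustration that assumes the input is already valid UTF-8; the function name count_code_points is made up for this example, and any byte that is not a recognised lead byte is simply counted as one unit rather than reported as an error.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts UTF-8 code points by inspecting lead bytes.
// Assumes `s` holds valid UTF-8; no validation of continuation bytes is done.
std::size_t count_code_points(const std::string& s) {
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        // char may be signed, so read the byte as unsigned before comparing.
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 1;                                  // 0..127: 1 byte
        if (lead >= 194 && lead <= 223)      len = 2;         // 2-byte sequence
        else if (lead >= 224 && lead <= 239) len = 3;         // 3-byte sequence
        else if (lead >= 240 && lead <= 244) len = 4;         // 4-byte sequence
        // Any other value is unexpected here and is skipped one byte at a time.
        i += len;
        ++count;
    }
    return count;
}
```

For example, calling count_code_points on the six bytes "\xE4\xB8\xAD\xE6\x96\x87" (two Chinese characters) gives 2, whereas size() on the same string gives 6.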

Since wchar_t on the iPhone is, I guess, 32 bits wide, if you really want a wstring you could use UTF8CPP to convert to UTF-32. UTF8CPP can also give you the code points of your string.
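If pulling in a library is not an option, the code point values can also be extracted by hand. The sketch below is not UTF8CPP's API; it is a hand-rolled decoder with a made-up name (decode_code_points) that assumes valid UTF-8 input, does no validation of continuation bytes, and stops early if the last sequence is truncated. It sticks to C++98 features so that it should build with older GCC versions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes a UTF-8 string into Unicode code point values.
// Assumes `s` holds valid UTF-8; malformed input is not detected.
std::vector<unsigned long> decode_code_points(const std::string& s) {
    std::vector<unsigned long> out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 1;
        unsigned long cp = lead;                               // 0xxxxxxx
        if (lead >= 194 && lead <= 223)      { len = 2; cp = lead & 0x1F; }  // 110xxxxx
        else if (lead >= 224 && lead <= 239) { len = 3; cp = lead & 0x0F; }  // 1110xxxx
        else if (lead >= 240 && lead <= 244) { len = 4; cp = lead & 0x07; }  // 11110xxx
        if (i + len > s.size()) break;                         // truncated sequence
        // Each continuation byte (10xxxxxx) contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

With this, decode_code_points("\xE4\xB8\xAD") yields a single element, 0x4E2D. On a platform where wchar_t is 32 bits, each value could be copied into a std::wstring; that width is an assumption about the target and worth checking with sizeof(wchar_t).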

That said, I don’t understand why you’re using C++ on the iPhone. See: Objective-C Tuesdays: wide character strings
