Looking at the Unicode standard, it recommends using plain chars for storing UTF-8 encoded strings. Does this work as expected with C++ and the basic std::string, or are there cases where the UTF-8 encoding can create problems?
For example, when computing the length, it may not be identical to the number of bytes – how is this supposed to be handled? Reading the standard, I’m probably fine using a char array for storage, but I’ll still need to write functions like strlen etc. on my own that work on encoded text, because as far as I understand the problem, the standard routines either handle ASCII only, or expect wide characters (16 bit or more), which the Unicode standard does not recommend. So far, the best source I found about the encoding stuff is a post on Joel on Software, but it does not explain what we poor C++ developers should use 🙂
There’s a library called ‘UTF8-CPP’, which lets you store your UTF-8 strings in standard std::string objects, and provides additional functions to enumerate and manipulate the UTF-8 characters.
I haven’t tested it yet, so I don’t know what it’s worth, but I am considering using it myself.