There are two convenience interfaces declared in the header <locale>: std::wstring_convert and std::wbuffer_convert. However, usage examples are absent.
Are there any concise examples to illustrate their usage and differences?
std::wstring_convert

Given a std::u32string (a.k.a. std::basic_string<char32_t>) that holds UTF-32 code units in the form of char32_t elements, you can convert it to a sequence of UTF-8 code units in the form of bytes by calling to_bytes on a std::wstring_convert instantiated with a suitable facet, for example std::codecvt_utf8<char32_t>.

Take note that a quirk of std::wstring_convert is that it always converts what the Standard calls a wide string (which is in fact any kind of specialization of std::basic_string, including std::string) to or from a byte string, which is a specialization of the form std::basic_string<char, std::char_traits<char>, Allocator>.

What the source and target encodings will be depends on what code conversion facet is used; here I am using one of the stock facets that come from <codecvt>. Any code conversion facet will do as long as it is Destructible, which is not the case for e.g. std::codecvt<wchar_t, char, std::mbstate_t>, because it has a protected destructor.

std::wbuffer_convert

Here's a hopefully compelling use case: you have an out object which is an instance of std::ostream (a.k.a. std::basic_ostream<char>) that expects UTF-8 encoded text. So for instance out << u8"Hello" should work just fine. As it so happens though, you have a lot of UTF-32 encoded wide strings (the best candidate for that job would be std::u32string) coming from somewhere else in your program, which you need to pass to out. You could use std::wstring_convert repeatedly, but that can get old quickly.

Another way is to wrap the stream buffer of out in a std::wbuffer_convert and construct a std::basic_ostream<char32_t> on top of it.
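As a rough sketch of that alternative (not the original listing): the helper name write_utf32 is hypothetical, and the choice of the stock facet std::codecvt_utf8<char32_t> is an assumption. Note that <codecvt> and both convenience interfaces are deprecated since C++17, so a compiler may warn. The sketch uses the unformatted write() rather than operator<<, so no fill or padding logic is involved.

```cpp
#include <cassert>
#include <codecvt>   // std::codecvt_utf8
#include <locale>    // std::wbuffer_convert
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: sends UTF-32 text to a byte-oriented stream as UTF-8.
// Returns true if the char32_t view of the stream is still in a good state.
bool write_utf32(std::ostream& out, const std::u32string& text)
{
    assert(out.rdbuf() != nullptr);  // there must be a byte buffer to wrap

    // A stream buffer of char32_t that converts and forwards to out's buffer.
    // It does not own that buffer, so out must outlive the converter.
    std::wbuffer_convert<std::codecvt_utf8<char32_t>, char32_t> converter(out.rdbuf());

    // A char32_t output stream layered on the converting buffer.
    std::basic_ostream<char32_t> wout(&converter);

    wout.write(text.data(), static_cast<std::streamsize>(text.size()));
    wout.flush();  // push any pending converted bytes through to out
    return wout.good();
}
```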
That is, we can get a view of out that behaves as if it were an instance of std::basic_ostream<char32_t> and that expects UTF-32 encoded text, and we didn't alter locales (that last bit being a big reason those convenience interfaces exist in the first place).

I'd like to think that std::wbuffer_convert is complementary to std::wstring_convert rather than a competitor.

As a disclaimer, because I haven't laid my hands on an implementation that supports either of those features or <codecvt>, the code here is completely untested :(.
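For the std::wstring_convert case described at the start of this answer, a minimal sketch could look like the following. The helper names utf32_to_utf8 and utf8_to_utf32 are hypothetical, and std::codecvt_utf8<char32_t> is assumed as the stock facet. With no error string supplied to the converter, to_bytes and from_bytes throw std::range_error on a failed conversion.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <codecvt>   // std::codecvt_utf8 (deprecated since C++17)
#include <locale>    // std::wstring_convert
#include <string>

// Hypothetical helper: UTF-32 "wide string" to UTF-8 byte string.
std::string utf32_to_utf8(const std::u32string& input)
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    return converter.to_bytes(input);
}

// Hypothetical helper: the reverse direction, UTF-8 bytes to UTF-32.
std::u32string utf8_to_utf32(const std::string& bytes)
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    return converter.from_bytes(bytes);
}
```

Under these assumptions, assert(utf32_to_utf8(U"Hello, World!") == "Hello, World!") should hold, since ASCII text is encoded identically in UTF-8.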