The Archive Base

Editorial Team
Asked: May 27, 2026

In C++, I want to use Unicode to do things. So after falling down the rabbit hole of Unicode, I’ve managed to end up in a train wreck of confusion, headaches and locales.

But in Boost I’ve had the unfortunate problem of trying to use Unicode file paths and trying to use the Boost program options library with Unicode input. I’ve read whatever I could find on the subjects of locales, codecvts, Unicode encodings and Boost.

My current attempt to get things to work is to have a codecvt that takes a UTF-8 string and converts it to the platform’s encoding (UTF-8 on POSIX, UTF-16 on Windows). I’ve been trying to avoid wchar_t.

The closest I’ve actually gotten is trying to do this with Boost.Locale, to convert from a UTF-8 string to a UTF-32 string on output.

#include <iostream>
#include <locale>
#include <string>
#include <boost/locale.hpp>

int main(void)
{
  std::string data("Testing, 㤹");

  std::locale fromLoc = boost::locale::generator().generate("en_US.UTF-8");
  std::locale toLoc   = boost::locale::generator().generate("en_US.UTF-32");

  typedef std::codecvt<wchar_t, char, mbstate_t> cvtType;
  cvtType const* toCvt = &std::use_facet<cvtType>(toLoc);

  std::locale convLoc = std::locale(fromLoc, toCvt);

  std::cout.imbue(convLoc);
  std::cout << data << std::endl;

  // Output is unconverted -- what?

  return 0;
}

I think I had some other kind of conversion working using wide characters, but I really don’t know what I’m even doing. I don’t know what the right tool for the job is at this point. Help?

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 12:26 pm

    Okay, after a long few months I’ve figured it out, and I’d like to help people in the future.

    First of all, the codecvt approach was the wrong way of doing it. Boost.Locale provides a simple way of converting between character sets in its boost::locale::conv namespace. Here’s one example (there are others that are not based on locales).

    #include <iostream>
    #include <locale>
    #include <string>
    #include <boost/locale.hpp>
    namespace loc = boost::locale;
    
    int main()
    {
      // Build a locale whose character encoding is UTF-32.
      loc::generator gen;
      std::locale blah = gen.generate("en_US.utf-32");
    
      std::string UTF8String = "Tésting!";
      // from_utf will also work with wide strings as it uses the character size
      // to detect the encoding.
      std::string converted = loc::conv::from_utf(UTF8String, blah);
    
      // Outputs a UTF-32 string.
      std::cout << converted << std::endl;
    
      return 0;
    }
    

    If you replace “en_US.utf-32” with “” (an empty string), the generator produces the user’s default locale, so the text is output in the user’s encoding.
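
To make concrete what a UTF-8 to UTF-32 conversion such as the one above has to do, here is a minimal sketch that uses only the standard library. It is not how Boost.Locale implements from_utf; the helper names utf8_to_utf32 and utf32_to_utf16 are hypothetical, and the UTF-16 step is included only because UTF-16 is the platform encoding on Windows mentioned in the question.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Boost.Locale): decode UTF-8 bytes into
// UTF-32 code points. Throws std::invalid_argument on malformed input.
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        // The lead byte tells us how many bytes the sequence occupies.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes six more bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, surrogates, and values above U+10FFFF.
        static const char32_t min_for_len[] = {0, 0, 0x80, 0x800, 0x10000};
        if (cp < min_for_len[len] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");

        out.push_back(cp);
        i += len;
    }
    return out;
}

// Hypothetical helper: encode UTF-32 code points as UTF-16, using a
// surrogate pair for anything above U+FFFF. Assumes valid code points.
std::u16string utf32_to_utf16(const std::u32string& in) {
    std::u16string out;
    for (char32_t cp : in) {
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
    }
    return out;
}
```

For instance, passing the UTF-8 bytes of "Tésting!" to utf8_to_utf32 should yield eight code points, with U+00E9 in the second position. In real code a library conversion is probably the better choice; the sketch only shows why byte length and character count differ.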

    I still don’t know how to make std::cout do this all the time, but the translate() function of Boost.Locale outputs in the user’s locale.

    As for using UTF-8 strings with the filesystem across platforms, it seems that it is possible; here’s a link to how to do it.
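
One possible way to keep UTF-8 strings at the application boundary is sketched below. Note that it swaps in the standard library's std::filesystem (C++17 and later) in place of Boost.Filesystem, so it is not necessarily what the linked page describes. The helper names path_from_utf8 and utf8_from_path are hypothetical.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: build a path from a std::string holding UTF-8 bytes.
// The library converts to the native encoding where needed (UTF-16 on Windows).
fs::path path_from_utf8(const std::string& utf8) {
#if __cplusplus >= 202002L
    // C++20: a char8_t string is always interpreted as UTF-8.
    return fs::path(std::u8string(utf8.begin(), utf8.end()));
#else
    // C++17: u8path interprets its argument as UTF-8.
    return fs::u8path(utf8);
#endif
}

// Hypothetical helper: get the path back as UTF-8 bytes in a std::string.
std::string utf8_from_path(const fs::path& p) {
    // u8string() returns std::string in C++17 and std::u8string in C++20;
    // copying through iterators works for both.
    const auto u8 = p.u8string();
    return std::string(u8.begin(), u8.end());
}
```

With helpers like these, the rest of the program can pass UTF-8 std::string values around and avoid wchar_t, converting only where a path object is needed.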

© 2021 The Archive Base. All Rights Reserved