The Archive Base

Editorial Team
Asked: May 13, 2026

Suppose we have an arbitrary string, s.

s has the property of being from just about anywhere in the world. People from the USA, Japan, Korea, Russia, China and Greece all write into s from time to time. Fortunately, we don’t have time travellers using Linear A.

For the sake of discussion, let’s presume we want to do string operations such as:

  • reverse
  • length
  • capitalize
  • lowercase
  • index into

and, just because this is for the sake of discussion, let’s presume we want to write these routines ourselves (instead of grabbing a library), and we have no legacy software to maintain.

There are three standard encodings for Unicode: UTF-8, UTF-16, and UTF-32, each with pros and cons. But let’s say I’m sorta dumb, and I want one Unicode encoding to rule them all (because rolling a dynamically adapting library for three different kinds of string encodings that hides the difference from the API user sounds hard).

  • Which encoding is most general?
  • Which encoding is supported by wchar_t?
  • Which encoding is supported by the STL?
  • Are these encodings all (or not at all) null-terminated?

—

The point of this question is to educate myself and others with useful and usable information about Unicode: reading the RFCs is fine, but there’s a ‘stack’ of information related to compilers, languages, and operating systems that the RFCs do not cover, yet that is vital to know in order to actually use Unicode in a real app.

1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 7:45 pm
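    1. Which encoding is most general?
      Probably UTF-32, though all three formats can encode every Unicode code point. UTF-32 has the property that every code point fits in a single 32-bit code unit, whereas UTF-8 needs one to four 8-bit code units and UTF-16 needs one or two 16-bit code units. Note that a single user-perceived character can still span several code points (for example, a base letter followed by a combining accent), so even UTF-32 does not make indexing by character trivial.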

    2. Which encoding is supported by wchar_t?
      None in particular. The size and encoding of wchar_t are implementation-defined. On Windows it is 16 bits wide and holds UTF-16 code units; on most Unix platforms it is 32 bits wide and holds UTF-32.

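    3. Which encoding is supported by the STL?
      None really. The STL can store any type of character you want. Just use the std::basic_string<T> template with a character type large enough to hold your code unit (since C++11, std::u16string and std::u32string are ready-made specializations for char16_t and char32_t). Most operations (e.g. std::reverse) do not know about any sort of Unicode encoding though: they work on individual elements, so reversing a UTF-8 or UTF-16 string element by element can split a multi-unit sequence.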

    4. Are these encodings all (or not at all) null-terminated?
      No. The NUL character (U+0000) is a legal value in any of those encodings. Technically, NUL is a legal character in plain ASCII too. NUL termination is a C string convention, not a property of the encoding.
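
To make points 1, 3 and 4 concrete, here is a minimal C++ sketch, assuming the input is valid UTF-8 held in a std::string. The helper names (count_code_points, reverse_bytes, reverse_code_points, c_string_length) are hypothetical and made up for illustration; they are not part of the STL. The sketch works at the code point level only, so a base letter and its combining accent would still be separated by the reversal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by skipping
// continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8 input.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++n;  // lead byte or ASCII byte
    }
    return n;
}

// Naive reversal: std::reverse treats each char as an independent element,
// so multi-byte UTF-8 sequences come out with their bytes in the wrong order.
std::string reverse_bytes(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}

// Hypothetical helper: reverses the order of code points while keeping the
// bytes of each code point in their original order. Assumes valid UTF-8.
std::string reverse_code_points(const std::string& utf8) {
    std::string out;
    out.reserve(utf8.size());
    std::size_t end = utf8.size();
    while (end > 0) {
        std::size_t start = end - 1;
        // Walk back over continuation bytes until the lead byte is reached.
        while (start > 0 &&
               (static_cast<unsigned char>(utf8[start]) & 0xC0) == 0x80) {
            --start;
        }
        out.append(utf8, start, end - start);
        end = start;
    }
    assert(out.size() == utf8.size());  // reversal only reorders bytes
    return out;
}

// Length as a C API would see it: counting stops at the first NUL byte,
// even though std::string itself can store NUL like any other value.
std::size_t c_string_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string "aé€" (bytes 61 C3 A9 E2 82 AC), size() reports 6 while count_code_points reports 3, and reverse_bytes produces a byte sequence that is no longer valid UTF-8. A string built as std::string("a\0b", 3) keeps all three elements, but c_string_length sees only 1.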

    Choosing how to do this has a lot to do with your platform. If you’re on Windows, use UTF-16 and wchar_t strings, because that’s what the Windows API uses to support Unicode. I’m not entirely sure what the best choice is for Unix platforms, but I do know that most of them use UTF-8.
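
Since the choice comes down to UTF-16 on Windows versus UTF-8 on most Unix systems, the sketch below shows how one code point maps to each encoding, under the assumption that text is held internally as char32_t code points. The function names (is_scalar_value, encode_utf8, encode_utf16, wchar_bits) are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Width of wchar_t in bits on the platform this is compiled for
// (typically 16 on Windows and 32 on most Unix systems).
constexpr std::size_t wchar_bits() { return sizeof(wchar_t) * CHAR_BIT; }

// True for Unicode scalar values: U+0000..U+10FFFF, excluding the
// surrogate range U+D800..U+DFFF that UTF-16 reserves for itself.
bool is_scalar_value(char32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Encodes one code point as 1 to 4 UTF-8 code units (bytes).
std::string encode_utf8(char32_t cp) {
    assert(is_scalar_value(cp));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Encodes one code point as 1 or 2 UTF-16 code units. Code points above
// U+FFFF become a surrogate pair (high unit first, then low unit).
std::u16string encode_utf16(char32_t cp) {
    assert(is_scalar_value(cp));
    std::u16string out;
    if (cp < 0x10000) {
        out += static_cast<char16_t>(cp);
    } else {
        const char32_t v = cp - 0x10000;
        out += static_cast<char16_t>(0xD800 | (v >> 10));
        out += static_cast<char16_t>(0xDC00 | (v & 0x3FF));
    }
    return out;
}
```

For example, U+20AC (the euro sign) takes three code units in UTF-8 but one in UTF-16, while U+1F600 takes four in UTF-8 and two (the surrogate pair D83D DE00) in UTF-16. Neither encoding therefore gives one code unit per code point, which is worth keeping in mind whichever one the platform pushes you towards.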


© 2021 The Archive Base. All Rights Reserved