
The Archive Base


Editorial Team
Asked: June 8, 2026 (2026-06-08T06:48:36+00:00)


I am developing an application of which the core code base would be cross-platform for Windows, iOS and Android.

My question is: how should I internally represent strings used by this app to be able to effectively use them on all three platforms?

It is important to note that I use DirectWrite heavily on Windows, and its API functions usually expect a wchar_t* to be passed. (By the way, the API documentation describes this parameter as “A pointer to an array of Unicode characters.”; I don’t know whether this means that they are in UTF-16 encoding or not.)

I see three different approaches (however, I find it quite difficult to grasp the details of handling Unicode strings with C++ in a cross-platform manner, so maybe I am missing some important concept):

  • use std::string internally everywhere (and store the strings in UTF-8 encoding?), and convert them to wchar_t* where the DirectWrite API needs it (I don’t know yet what the text-processing APIs of Android and iOS need).
  • use std::wstring internally everywhere. If I understand things right, this wouldn’t be efficient from a memory-usage perspective, because a wchar_t is 4 bytes on iOS and Android (and does it mean that I would have to store the string in UTF-16 on Windows, and in UTF-32 on Android/iOS?)
  • create an abstraction for strings with an abstract base class, and implement the internal storage specifically for each platform.

What would be the best solution? And by the way, are there any existing cross-platform libraries that abstract string handling? (and also, reading and serializing of Unicode strings)

(UPDATE: deleted the part with the question about the difference of char* and std::string.)

1 Answer

Editorial Team added an answer on June 8, 2026 at 6:48 am (2026-06-08T06:48:38+00:00)

    Part of my question came from misunderstanding, or not completely understanding, how the string and wstring classes work in C++ (I come from a C# background).
    The differences between the two, with their pros and cons, are described in this great answer: std::wstring VS std::string.

    How string and wstring work

    For me, the single most important discovery about the string and wstring classes was that semantically they do not represent a piece of encoded text, but simply a “string” of char or wchar_t elements. They are more like a simple data array with some string-specific operations (like append and substr) than a representation of text. Neither of them is aware of any kind of string encoding whatsoever; they handle each char or wchar_t element individually, as a separate character.

    Encodings

    However, on most systems, if you create a string from a string literal with a special character like this:

    std::string s("ű");
    

    The ű character will be represented by more than one byte in memory, but that has nothing to do with the std::string class; it is a feature of the compiler, which can encode string literals in UTF-8 (not every compiler does, though). (String literals prefixed with L are represented by wchar_t elements in UTF-16, UTF-32 or something else, depending on the compiler.)
    Thus the string “ű” (U+0171) is represented in memory by two bytes, 0xC5 0xB1, and the std::string class does not know that those two bytes semantically mean one character (one Unicode code point) in UTF-8. Hence the sample code:

    std::string s("ű");
    std::cout << s.length() << std::endl;
    std::cout << s.substr(0, 1);
    

    produces the following result (depending on the compiler: some compilers do not treat string literals as UTF-8, and some depend on the encoding of the source file):

    2
    �
    

    The length() function (a synonym of size()) returns 2, because the only thing the std::string knows is that it stores two bytes (two chars). substr works “primitively” as well: it returns a string containing the single char 0xC5, which is displayed as �, because on its own it is not a valid UTF-8 sequence (but that does not bother std::string).
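    The difference between bytes and code points can be made concrete with a small sketch. The function names below (count_code_points_utf8, check_utf8_counting) are illustrative and not part of the original post. The code assumes the input is valid UTF-8, and it writes the two bytes of “ű” out explicitly so the result does not depend on how the compiler encodes string literals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts Unicode code points in a string that is ASSUMED to hold valid UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx; every other byte starts
// a new code point, so only those bytes are counted.
std::size_t count_code_points_utf8(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Usage example: "ű" (U+0171) spelled as its two UTF-8 bytes.
void check_utf8_counting()
{
    const std::string s("\xC5\xB1");
    assert(s.length() == 2);                   // std::string counts chars (bytes)
    assert(count_code_points_utf8(s) == 1);    // but it is one code point
}
```

    std::string itself stays unaware of the encoding; any notion of “characters” has to be layered on top by code like this or by a library.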

    From that we can see that encodings are handled not by the string classes but by the text-processing APIs of the platform, such as the simple cout or DirectWrite.
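    Because the string classes never interpret their contents, moving text from one encoding to another is always an explicit step. The following is a minimal sketch of a UTF-8 to UTF-16 conversion, not the code used in my application. It assumes a C++11 compiler (for char16_t and std::u16string), the name utf8_to_utf16 is made up for illustration, and the validation is deliberately simplified: overlong forms and encoded surrogates are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Converts UTF-8 bytes to UTF-16 code units.
// Throws std::invalid_argument on a bad lead byte, a bad continuation byte,
// a truncated sequence, or a value above U+10FFFF.
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i <= extra)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one 16-bit code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else if (cp <= 0x10FFFF) {
            // Supplementary plane: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            throw std::invalid_argument("code point out of range");
        }
    }
    return out;
}

// Usage example: the two UTF-8 bytes of "ű" become the single unit 0x0171.
void check_utf8_to_utf16()
{
    const std::u16string s = utf8_to_utf16("\xC5\xB1");
    assert(s.size() == 1 && s[0] == 0x0171);
}
```

    In production code a tested library or the platform's own conversion functions would normally be preferable to a hand-written decoder like this one.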

    My approach

    In my application DirectWrite is very important, and it only accepts strings encoded in UTF-16 (in the form of wchar_t* pointers). So I decided to store the strings encoded in UTF-16 both in memory and in files. However, I wanted a cross-platform implementation that can handle UTF-16 strings on Windows, Android and iOS, which is not possible with std::wstring, because the size of its elements (and hence the encoding it is suited to) is platform-dependent.
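    The platform dependence can be checked directly. This is only an illustrative sketch: the names wchar_bits, char16_bits and likely_wide_encoding are made up here, char16_t requires C++11, and inferring the encoding from the width of wchar_t is an assumption that merely matches common platforms, not something the language guarantees.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>

// Width in bits of one element of each character type on this platform.
constexpr std::size_t wchar_bits  = sizeof(wchar_t)  * CHAR_BIT;
constexpr std::size_t char16_bits = sizeof(char16_t) * CHAR_BIT;

// char16_t has the same size as std::uint_least16_t, so at least 16 bits.
static_assert(char16_bits >= 16, "char16_t must hold a UTF-16 code unit");

// Guesses the encoding used for wide strings from the width of wchar_t.
// ASSUMPTION: 16 bits means UTF-16 (typical on Windows) and 32 bits means
// UTF-32 (typical on Linux, Android and iOS).
const char* likely_wide_encoding()
{
    return wchar_bits == 16 ? "UTF-16"
         : wchar_bits == 32 ? "UTF-32"
         : "unknown";
}

// Usage example: on mainstream platforms the guess is never "unknown".
void check_wide_encoding()
{
    assert(std::strcmp(likely_wide_encoding(), "unknown") != 0);
}
```

    Compiling this on each target shows that the same std::wstring code stores a different number of bytes per element depending on the platform.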

    To create a cross-platform, strictly UTF-16 string class, I instantiated basic_string with a data type that is 2 bytes long. Quite surprisingly – at least for me – I found almost no information about this online; I based my solution on this approach. Here is the code:

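    #include <cstddef>  // std::size_t
    #include <cstdio>   // EOF
    #include <cstring>  // std::memcpy, std::memmove
    #include <cwchar>   // std::mbstate_t
    #include <ios>      // std::streampos, std::streamoff
    #include <string>   // std::basic_string

    // Define this on every platform to be 16 bits (2 bytes) wide!
    typedef unsigned short char16;

    struct char16_traits
    {
        typedef char16 char_type;
        typedef int int_type;
        typedef std::streampos pos_type;
        typedef std::streamoff off_type;
        typedef std::mbstate_t state_type;

        static void assign(char_type& dst, const char_type& src)
        { dst = src; }
        static bool eq(const char_type& a, const char_type& b)
        { return a == b; }
        static bool lt(const char_type& a, const char_type& b)
        { return a < b; }
        // Compare element by element: memcmp compares bytes, which orders
        // 16-bit values incorrectly on little-endian machines.
        static int compare(const char_type* a, const char_type* b, std::size_t n)
        {
            for (std::size_t i = 0; i < n; ++i) {
                if (lt(a[i], b[i])) return -1;
                if (lt(b[i], a[i])) return 1;
            }
            return 0;
        }
        // Number of elements before the terminating zero.
        static std::size_t length(const char_type* s)
        {
            std::size_t count = 0;
            while (s[count] != 0) {
                ++count;
            }
            return count;
        }
        static char_type* copy(char_type* dst, const char_type* src, std::size_t n)
        { return static_cast<char_type*>(std::memcpy(dst, src, n * sizeof(char_type))); }
        static const char_type* find(const char_type* s, std::size_t n, const char_type& c)
        {
            for (std::size_t i = 0; i < n; ++i) {
                if (s[i] == c) {
                    return &s[i];
                }
            }
            return 0;
        }
        static char_type* move(char_type* dst, const char_type* src, std::size_t n)
        { return static_cast<char_type*>(std::memmove(dst, src, n * sizeof(char_type))); }
        static char_type* assign(char_type* s, std::size_t n, const char_type& c)
        {
            for (std::size_t i = 0; i < n; ++i) {
                assign(s[i], c);
            }
            return s;
        }
        static char_type to_char_type(const int_type& c)
        { return static_cast<char_type>(c); }
        static int_type to_int_type(const char_type& c)
        { return static_cast<int_type>(c); }
        static bool eq_int_type(const int_type& a, const int_type& b)
        { return a == b; }
        static int_type eof()
        { return EOF; }
        // Returns c unchanged unless it equals eof(), in which case 0.
        static int_type not_eof(const int_type& c)
        { return c != eof() ? c : 0; }
    };

    typedef std::basic_string<char16, char16_traits> utf16string;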
    

    Strings are stored with the above class, and the raw UTF-16 data is passed to the specific API functions of the various platforms, all of which at the moment seem to support UTF-16 encoding.
    The implementation might not be perfect; however, the append, substr and size functions seem to work properly. I still don’t have much experience with string handling in C++, so feel free to comment/edit if I stated something incorrectly.
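    One caveat about size: just as std::string counts bytes, a UTF-16 string counts 16-bit code units, and a character outside the Basic Multilingual Plane takes two of them (a surrogate pair). The sketch below illustrates this. It uses std::u16string, which C++11 and later provide as a standard string of char16_t elements, as a stand-in for the utf16string class above; count_code_points_utf16 is an illustrative name, and the input is assumed to be well-formed UTF-16.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts code points in a string ASSUMED to hold well-formed UTF-16.
// A low (trailing) surrogate, 0xDC00..0xDFFF, is the second half of a pair,
// so it is skipped instead of being counted.
std::size_t count_code_points_utf16(const std::u16string& s)
{
    std::size_t count = 0;
    for (char16_t unit : s) {
        if (unit < 0xDC00 || unit > 0xDFFF) {
            ++count;
        }
    }
    return count;
}

// Usage example: U+1F600 is stored as the surrogate pair 0xD83D 0xDE00.
void check_utf16_counting()
{
    const std::u16string emoji{char16_t(0xD83D), char16_t(0xDE00)};
    assert(emoji.size() == 2);                     // two code units
    assert(count_code_points_utf16(emoji) == 1);   // one code point
}
```

    The same applies to substr: cutting between the two halves of a surrogate pair yields invalid text, exactly like cutting a UTF-8 sequence in a std::string.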
