The Archive Base

Editorial Team
Asked: June 16, 2026

libxml2 seems to store all its strings in UTF-8, as xmlChar *.

/**
 * xmlChar:
 *
 * This is a basic byte in an UTF-8 encoded string.
 * It's unsigned allowing to pinpoint case where char * are assigned
 * to xmlChar * (possibly making serialization back impossible).
 */
typedef unsigned char xmlChar;

As libxml2 is a C library, it provides no routines to get a std::wstring out of an xmlChar *. I’m wondering whether the prudent way to convert an xmlChar * to a std::wstring in C++11 is to use the mbstowcs C function, via something like this (work in progress):

std::wstring xmlCharToWideString(const xmlChar *xmlString) {
    if(!xmlString){abort();} //provided string was null
    int charLength = xmlStrlen(xmlString); //excludes null terminator
    wchar_t *wideBuffer = new wchar_t[charLength];
    size_t wcharLength = mbstowcs(wideBuffer, (const char *)xmlString, charLength);
    if(wcharLength == (size_t)(-1)){abort();} //mbstowcs failed
    std::wstring wideString(wideBuffer, wcharLength);
    delete[] wideBuffer;
    return wideString;
}

Edit: Just an FYI, I’m well aware of what xmlStrlen returns: the number of xmlChar used to store the string, that is, the number of unsigned char (bytes), not the number of characters. Naming it byteLength would have been less confusing, but I thought charLength read more clearly next to wcharLength. As for the correctness of the code, I believe wideBuffer will always be at least as large as the converted string requires, and that characters needing more space than a wchar_t will be truncated (I think).
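To make the byte-versus-character distinction above concrete, here is a minimal sketch. It assumes a stand-in xmlChar typedef so that it compiles without the libxml2 headers, and the helpers byteLength and countCodePoints are hypothetical names introduced only for illustration. It also assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <cstring>

typedef unsigned char xmlChar; // stand-in for libxml2's typedef (assumption, so the sketch compiles alone)

// Hypothetical helper: number of bytes before the null terminator,
// comparable to what xmlStrlen() reports.
std::size_t byteLength(const xmlChar *s)
{
    return std::strlen(reinterpret_cast<const char *>(s));
}

// Hypothetical helper: number of code points in a well-formed UTF-8 string,
// found by skipping continuation bytes (those of the form 10xxxxxx).
std::size_t countCodePoints(const xmlChar *s)
{
    std::size_t n = 0;
    for (; *s; ++s)
    {
        if ((*s & 0xC0) != 0x80) { ++n; }
    }
    return n;
}
```

For the string "a", U+00E9, U+20AC (encoded as 1 + 2 + 3 bytes), byteLength reports 6 while countCodePoints reports 3.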

1 Answer

  1. Editorial Team
    Added an answer on June 16, 2026 at 3:24 pm

    xmlStrlen() returns the number of UTF-8 code units (bytes) in the xmlChar* string. That is generally not the number of wchar_t code units needed once the data is converted, so do not use xmlStrlen() as the size of your wchar_t string. Instead, call std::mbstowcs() once with a null destination to get the correct length (a common extension, specified by POSIX), allocate the memory, and call std::mbstowcs() again to fill it. Note that std::mbtowc() converts only a single character, so it is not suitable here. You also have to use std::setlocale() to tell mbstowcs() to decode UTF-8; messing with the locale may not be a good idea, especially if multiple threads are involved. For example:

    #include <clocale>
    #include <cstdlib>
    #include <string>
    #include <libxml/xmlstring.h>

    std::wstring xmlCharToWideString(const xmlChar *xmlString)
    {
        if (!xmlString) { abort(); } //provided string was null

        std::wstring wideString;

        int charLength = xmlStrlen(xmlString); //number of UTF-8 bytes, excludes null terminator
        if (charLength > 0)
        {
            //copy the locale name: the pointer returned by setlocale() may be invalidated by the next call
            const char *current = setlocale(LC_CTYPE, NULL);
            std::string origLocale = current ? current : "C";
            if (!setlocale(LC_CTYPE, "en_US.UTF-8")) { abort(); } //UTF-8 locale not available

            //with a null destination, mbstowcs() returns the required length, excluding the null terminator
            size_t wcharLength = mbstowcs(NULL, (const char*) xmlString, 0);
            if (wcharLength != (size_t)(-1))
            {
                wideString.resize(wcharLength);
                mbstowcs(&wideString[0], (const char*) xmlString, wcharLength);
            }

            setlocale(LC_CTYPE, origLocale.c_str());
            if (wcharLength == (size_t)(-1)) { abort(); } //mbstowcs failed
        }

        return wideString;
    }
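Because the locale has to be switched and then restored on every exit path, one way to keep that bookkeeping in one place is a small RAII guard. The following is only a sketch: the class name ScopedCtypeLocale is hypothetical, and it does not make setlocale() thread-safe, since the locale it changes is still process-wide.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: switches LC_CTYPE for the lifetime of the object
// and restores the previous setting in the destructor.
class ScopedCtypeLocale
{
public:
    explicit ScopedCtypeLocale(const char *name)
    {
        // Copy the current name first; the pointer returned by setlocale()
        // may be invalidated by the next setlocale() call.
        const char *current = std::setlocale(LC_CTYPE, nullptr);
        saved_ = current ? current : "C";
        active_ = std::setlocale(LC_CTYPE, name) != nullptr;
    }

    ~ScopedCtypeLocale() { std::setlocale(LC_CTYPE, saved_.c_str()); }

    // False if the requested locale is not installed on this system.
    bool active() const { return active_; }

    ScopedCtypeLocale(const ScopedCtypeLocale &) = delete;
    ScopedCtypeLocale &operator=(const ScopedCtypeLocale &) = delete;

private:
    std::string saved_;
    bool active_;
};
```

A caller would construct the guard with the UTF-8 locale name, check active(), and then run the mbstowcs() calls inside that scope.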
    

    A better option, since you mention C++11, is to use std::codecvt_utf8 with std::wstring_convert, so you do not have to deal with locales (note that both are deprecated as of C++17):

    #include <codecvt>
    #include <cstdlib>
    #include <locale>
    #include <stdexcept>
    #include <string>
    #include <libxml/xmlstring.h>

    std::wstring xmlCharToWideString(const xmlChar *xmlString)
    {
        if (!xmlString) { abort(); } //provided string was null
        try
        {
            std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
            return conv.from_bytes((const char*)xmlString);
        }
        catch(const std::range_error&)
        {
            abort(); //wstring_convert failed (malformed UTF-8)
        }
    }
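One caveat with std::codecvt_utf8<wchar_t> is that on platforms where wchar_t is 16 bits wide (Windows, for example) it produces UCS-2 rather than UTF-16, so characters outside the Basic Multilingual Plane cannot be converted. A possible variation, sketched below, selects std::codecvt_utf8_utf16 in that case. The names WideFacet and utf8ToWide are hypothetical, the xmlChar typedef is a stand-in so the sketch compiles without libxml2, and errors are reported by exception instead of abort().

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>
#include <type_traits>

typedef unsigned char xmlChar; // stand-in for libxml2's typedef (assumption)

// Pick the UTF-16 facet when wchar_t is 16 bits, the UCS-4 facet otherwise.
typedef std::conditional<sizeof(wchar_t) == 2,
                         std::codecvt_utf8_utf16<wchar_t>,
                         std::codecvt_utf8<wchar_t> >::type WideFacet;

// Hypothetical helper: converts a null-terminated UTF-8 string to std::wstring.
// Throws std::invalid_argument on a null pointer and std::range_error
// (from wstring_convert) on malformed input.
std::wstring utf8ToWide(const xmlChar *xmlString)
{
    if (!xmlString) { throw std::invalid_argument("null string"); }
    std::wstring_convert<WideFacet, wchar_t> conv;
    return conv.from_bytes(reinterpret_cast<const char *>(xmlString));
}
```

Since these facilities are deprecated from C++17 onward, compilers may emit deprecation warnings for this code in newer language modes.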
    

    An alternative is to use a dedicated conversion library, such as ICU or iconv, to handle the Unicode conversion.
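As a rough illustration of the iconv route, the sketch below uses the POSIX iconv_open(), iconv() and iconv_close() calls. Several things here are assumptions: the function name utf8ToWideIconv is hypothetical, the xmlChar typedef is a stand-in for the libxml2 header, and the "WCHAR_T" encoding name is accepted by glibc and GNU libiconv but may not be by other implementations. Some platforms also need an extra link flag (such as -liconv) or declare the input argument of iconv() as const char **.

```cpp
#include <iconv.h>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

typedef unsigned char xmlChar; // stand-in for libxml2's typedef (assumption)

// Hypothetical helper: converts a null-terminated UTF-8 string to std::wstring via iconv.
std::wstring utf8ToWideIconv(const xmlChar *xmlString)
{
    if (!xmlString) { throw std::invalid_argument("null string"); }

    const char *bytes = reinterpret_cast<const char *>(xmlString);
    std::size_t inLeft = std::strlen(bytes);
    if (inLeft == 0) { return std::wstring(); }

    // "WCHAR_T" as a target encoding name is an assumption (glibc, GNU libiconv).
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    if (cd == (iconv_t)-1) { throw std::runtime_error("iconv_open failed"); }

    // Capacity only: one wchar_t per input byte is always enough for UTF-8.
    // The string is trimmed to the real length after conversion.
    std::wstring wide(inLeft, L'\0');
    char *in = const_cast<char *>(bytes);
    char *out = reinterpret_cast<char *>(&wide[0]);
    std::size_t outLeft = wide.size() * sizeof(wchar_t);

    std::size_t rc = iconv(cd, &in, &inLeft, &out, &outLeft);
    iconv_close(cd);
    if (rc == (std::size_t)-1) { throw std::runtime_error("iconv conversion failed"); }

    // outLeft is the unused space in bytes; drop the unused wide characters.
    wide.resize(wide.size() - outLeft / sizeof(wchar_t));
    return wide;
}
```

Unlike the mbstowcs() approach, this does not touch the process locale, which makes it easier to use from multiple threads as long as each call opens its own conversion descriptor.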

© 2021 The Archive Base. All Rights Reserved