how can I convert a wchar_t ( ‘9’ ) to a digit in the

Question

Editorial Team

Asked: May 22, 2026 (2026-05-22T17:43:23+00:00)

How can I convert a wchar_t ('9') to a digit in the form of an int (9)?

I have the following code where I check whether or not peek is a digit:

if (iswdigit(peek)) {
    // store peek as numeric
}

Can I just subtract '0', or are there Unicode specifics I should worry about?

1 Answer

Editorial Team · Answer 1 · 2026-05-22T17:43:24+00:00

If the question concerns just '9' (or any other ASCII digit,
'0' through '9'), subtracting '0' is the correct solution. If
you’re concerned with anything for which iswdigit returns
non-zero, however, the issue may be far more complex. The
standard says that iswdigit returns a non-zero value if its
argument is “a decimal digit wide-character code [in the current
locale]”, which is vague and leaves it up to the locale to
define exactly what is meant. In the “C” locale or the “POSIX”
locale, the POSIX standard, at least, guarantees that only the
ASCII digits zero through nine are considered decimal digits (if
I understand it correctly), so if you’re in the “C” or “POSIX”
locale, just subtracting '0' should work.

Presumably, in a Unicode locale, this would be any character
which has the general category Nd. There are a number of
these. The safest solution would be simply to create something
like (variables here with static lifetime):

#include <algorithm>    // std::find
#include <iterator>     // std::begin, std::end

// One row of ten characters per digit set, in ascending order.
wchar_t const* const digitTables[] =
{
    L"0123456789",
    L"\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669",
    // ...
};

//!     \return
//!         wch as a numeric digit, or -1 if it is not a digit
int asNumeric( wchar_t wch )
{
    int result = -1;
    for ( wchar_t const* const* p = std::begin( digitTables );
            p != std::end( digitTables ) && result == -1;
            ++ p ) {
        wchar_t const* q = std::find( *p, *p + 10, wch );
        if ( q != *p + 10 ) {
            // offset within the row is the digit's value
            result = static_cast< int >( q - *p );
        }
    }
    return result;
}
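As a rough illustration of how such a lookup might feed the "store peek as numeric" step from the question, here is a minimal, self-contained sketch. The names kDigitRows, digitValue and parseLeadingDigits are hypothetical, the table holds only the two rows shown above, and overflow is deliberately not handled.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical two-row table mirroring the answer's digitTables:
// ASCII digits and Arabic-Indic digits (U+0660..U+0669).
static wchar_t const* const kDigitRows[] = {
    L"0123456789",
    L"\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669",
};

// Returns the value 0..9 of wch, or -1 if it is in none of the rows.
int digitValue(wchar_t wch)
{
    for (wchar_t const* row : kDigitRows) {
        wchar_t const* hit = std::find(row, row + 10, wch);
        if (hit != row + 10) {
            return static_cast<int>(hit - row);
        }
    }
    return -1;
}

// Accumulates the leading digits of text into value and returns how many
// characters were consumed (0 if text does not start with a known digit).
// Assumption: the input is short enough that value cannot overflow.
std::size_t parseLeadingDigits(std::wstring const& text, unsigned long& value)
{
    value = 0;
    std::size_t used = 0;
    while (used < text.size()) {
        int d = digitValue(text[used]);
        if (d < 0) {
            break;
        }
        value = value * 10 + static_cast<unsigned long>(d);
        ++used;
    }
    return used;
}
```

Returning the number of consumed characters lets the caller tell "no digits at all" apart from a parsed value of zero.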

If you go this way, you’ll definitely want to download the
UnicodeData.txt file from the Unicode Consortium (the
“Unicode Character Database” page has links to both the Unicode
data file and an explanation of the encodings used in it), and
possibly write a simple parser for this file to extract the
information automatically (e.g. when there is a new version of
Unicode). The file is designed for simple programmatic
parsing.
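A parser along those lines could look roughly like the sketch below. It assumes the usual UnicodeData.txt layout: semicolon-separated fields, with the code point in hex in field 0, the general category in field 2, and the decimal digit value in field 6. The function names splitFields and readDecimalDigits are made up for this example, and error handling for malformed lines is omitted.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Splits one UnicodeData.txt line into its semicolon-separated fields.
// Empty fields in the middle of a line are preserved as empty strings.
std::vector<std::string> splitFields(std::string const& line)
{
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ';')) {
        fields.push_back(field);
    }
    return fields;
}

// Maps each code point whose general category is Nd to its decimal value.
// Lines that are too short, not Nd, or lack a digit value are skipped.
std::map<char32_t, int> readDecimalDigits(std::istream& data)
{
    std::map<char32_t, int> digits;
    std::string line;
    while (std::getline(data, line)) {
        std::vector<std::string> f = splitFields(line);
        if (f.size() < 7 || f[2] != "Nd" || f[6].empty()) {
            continue;
        }
        char32_t cp = static_cast<char32_t>(std::stoul(f[0], nullptr, 16));
        digits[cp] = std::stoi(f[6]);
    }
    return digits;
}
```

The resulting map could be used directly for lookups, or dumped as source text to regenerate a table like digitTables whenever a new Unicode version is released.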

Finally, note that solutions based on ostringstream and
istringstream (this includes boost::lexical_cast) will not
work, since the conversions used in streams are defined to only
use the ASCII digits. On the other hand, it might be
reasonable to restrict your code to just the ASCII digits. In
that case, the test becomes if ( wch >= L'0' && wch <= L'9' ),
and the conversion is done by simply subtracting L'0',
always supposing that the native encoding of wide character
constants in your compiler is Unicode (the case, I’m pretty
sure, of both VC++ and g++). Or just ensure that the locale is
“C” (or “POSIX”, on a Unix machine).
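For the restricted case, the range test and the subtraction fit in a single small function. This is only a sketch; the name asciiDigitValue is hypothetical, and the -1 return for non-digits simply follows the convention used by asNumeric above.

```cpp
// Returns 0..9 for L'0'..L'9', or -1 for any other wide character,
// including digits from other scripts such as U+0669.
int asciiDigitValue(wchar_t wch)
{
    return (wch >= L'0' && wch <= L'9')
        ? static_cast<int>(wch - L'0')
        : -1;
}
```

Doing the range check yourself, instead of calling iswdigit first, keeps the result independent of the current locale.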

EDIT: I forgot to mention: if you’re doing any serious Unicode programming, you
should look into ICU. Handling Unicode
correctly is extremely non-trivial, and they’ve a lot of functionality already
implemented.
