This is an ANSI C question. I have the following code.
```c
#include <stdio.h>
#include <locale.h>
#include <wchar.h>

int main(void)
{
    wint_t c;   /* declared up front: ANSI C (C89) disallows declarations after statements */

    if (!setlocale(LC_CTYPE, "")) {
        printf("Can't set the specified locale! "
               "Check LANG, LC_CTYPE, LC_ALL.\n");
        return -1;
    }
    while ((c = getwc(stdin)) != WEOF) {
        printf("%lc", c);
    }
    return 0;
}
```
I need full UTF-8 support, but even at this simplest level, can I improve this somehow? Why is `wint_t` used, and not `wchar_t`, with appropriate changes?
UTF-8 is one possible encoding of Unicode. It uses 1, 2, 3, or 4 bytes per character. When you read through `getwc()`, it fetches one to four bytes and composes them into a single Unicode code point, which fits in a `wchar_t` (which can be 16 or even 32 bits wide, depending on the platform).

But since Unicode character values can occupy the entire range from `0x0000` to `0xFFFF`, there are no values left over to signal an error or end-of-file condition. (Some have pointed out that Unicode is larger than 16 bits, which is true; in those cases surrogate pairs are used. The point stands: Unicode can use every available `wchar_t` value, leaving none for EOF.)

End-of-file is reported as `WEOF`, which is typically defined as -1. If you stored the return value of `getwc()` in a `wchar_t`, there would be no way to distinguish it from the character `0xFFFF` (which, incidentally, is reserved anyway, but I digress).

So the answer is to use a wider type, `wint_t` (or `int`), which is guaranteed to represent every `wchar_t` value plus the distinct value `WEOF`. That leaves the lower bits for the real character, while any value outside that range means something other than an ordinary character was returned.

Why not use `wint_t` everywhere instead of `wchar_t`, then? Most string-related functions use `wchar_t` because on some platforms it is half the size of `wint_t`, so strings have a smaller memory footprint; `wint_t` exists mainly for return values that must also be able to carry `WEOF`.