I’m trying to store a wchar_t string as octets, but I’m positive I’m doing

Asked: May 15, 2026

I’m trying to store a wchar_t string as octets, but I’m positive I’m doing it wrong. Would anybody mind validating my attempt? What happens when one character takes 4 bytes?

  unsigned int i;
  const wchar_t *wchar1 = L"abc";
  wprintf(L"%ls\r\n", wchar1);

  for (i=0;i< wcslen(wchar1);i++) {
    printf("(%d)", (wchar1[i]) & 255);
    printf("(%d)", (wchar1[i] >> 8) & 255);
  }

1 Answer

Editorial Team · Answer 1 · 2026-05-15T22:13:16+00:00

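Unicode text is always stored in some encoding. The popular ones are UTF-8, UTF-16 and UTF-32. Only UTF-32 uses a fixed size per code point (4 bytes). UTF-16 uses surrogate pairs for code points in the upper planes (above U+FFFF), so where wchar_t is 16 bits wide, as on Windows, such a character takes 2 wchar_t. UTF-8 is byte oriented and uses between 1 and 4 bytes to encode a code point. Your loop writes only the low two bytes of each wchar_t, so where wchar_t is 4 bytes wide it silently drops everything above U+FFFF.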
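To make the size differences concrete, here is a minimal sketch that computes how many code units one code point needs in UTF-8 and UTF-16, and how a code point above U+FFFF splits into a surrogate pair. The function names (utf8_length, utf16_length, utf16_surrogates) are made up for this illustration and are not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one code point in UTF-8 (0 if it is not encodable). */
size_t utf8_length(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogates are not characters */
    if (cp < 0x80)      return 1;
    if (cp < 0x800)     return 2;
    if (cp < 0x10000)   return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;                                   /* beyond the Unicode range */
}

/* 16-bit units needed to encode one code point in UTF-16 (0 if not encodable). */
size_t utf16_length(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000)   return 1;
    if (cp <= 0x10FFFF) return 2;               /* needs a surrogate pair */
    return 0;
}

/* Split a code point above U+FFFF into its UTF-16 surrogate pair.
   Returns 1 on success, 0 if cp does not need (or cannot have) a pair. */
int utf16_surrogates(uint32_t cp, uint16_t *high, uint16_t *low)
{
    if (cp < 0x10000 || cp > 0x10FFFF) return 0;
    cp -= 0x10000;                              /* 20 bits remain */
    *high = (uint16_t)(0xD800 + (cp >> 10));    /* top 10 bits */
    *low  = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* bottom 10 bits */
    return 1;
}
```

For example, U+1F600 needs 4 bytes in UTF-8 and 2 units in UTF-16, where it becomes the pair 0xD83D 0xDE00. In UTF-32 every code point is a single 4-byte unit.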

UTF-8 is an excellent choice if you need to transcode the text to a byte-oriented stream; it is a very common choice for text files and for HTML on the Internet. On Windows you can use WideCharToMultiByte() with CodePage = CP_UTF8. A good alternative is the ICU library.
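If neither WideCharToMultiByte() nor ICU is available, the conversion can also be written by hand. The following is only a sketch under stated assumptions: wide_to_utf8 is a hypothetical helper, it assumes wchar_t holds UTF-16 code units (16-bit wchar_t) or UTF-32 code points (32-bit wchar_t), and it leniently accepts surrogate pairs in either case.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Encode the NUL-terminated wide string src as UTF-8 into dst (cap bytes).
   The output is not NUL-terminated. Returns the number of bytes written,
   or (size_t)-1 on invalid input or if dst is too small. */
size_t wide_to_utf8(const wchar_t *src, unsigned char *dst, size_t cap)
{
    size_t n = 0;
    while (*src != L'\0') {
        uint32_t cp = (uint32_t)*src++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            uint32_t lo = (uint32_t)*src;
            if (lo < 0xDC00 || lo > 0xDFFF) return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            src++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                  /* stray low surrogate */
        }
        if (cp > 0x10FFFF) return (size_t)-1;   /* outside Unicode */

        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (cap - n < len) return (size_t)-1;   /* not enough room */

        if (len == 1) {
            dst[n++] = (unsigned char)cp;
        } else if (len == 2) {
            dst[n++] = (unsigned char)(0xC0 | (cp >> 6));
            dst[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (len == 3) {
            dst[n++] = (unsigned char)(0xE0 | (cp >> 12));
            dst[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (unsigned char)(0xF0 | (cp >> 18));
            dst[n++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return n;
}
```

Called on L"abc" this writes the three bytes 0x61 0x62 0x63. A string holding U+20AC yields 0xE2 0x82 0xAC, regardless of whether wchar_t is 2 or 4 bytes wide.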

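Be careful to avoid byte encodings that translate text to a code page, such as wcstombs(), which converts to the encoding of the current locale. They are lossy: characters that have no corresponding code in the code page cannot be represented, and typically the conversion either fails or replaces them with ?.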
