I’m trying to figure out the safest way to retrieve unicode data in a

Asked: May 26, 2026 (2026-05-26T12:38:04+00:00)

I’m trying to figure out the safest way to retrieve Unicode data from remote computers in a uniform way, and to make sure that data stays consistent and readable.

Computer A: Chinese user, mixed English Windows 7, some registry values contain Chinese characters like L"您好"

Computer B: US English, no unicode values returned from my functions

Computer C: Introduces an agent to Computer A and B.

The agent assesses the health and security of the computer from the inside. One Unicode-aware section simply reads registry values, i.e.:

int Utilities::GetRegistryStringValue(HKEY h_sub_key, WCHAR* value_name, wstring &result)
{
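DWORD cbData = 0;
DWORD type = 0;   // must be a real DWORD; a NULL LPDWORD returns no type

// First call: get the size (in bytes) and the type of the value
long err = RegQueryValueExW(h_sub_key, value_name, NULL, &type, NULL, &cbData);

if (err != ERROR_SUCCESS)
{
    if (err != ERROR_FILE_NOT_FOUND)
        debug->DebugMessage(Error::GetErrorMessageW(err));
    return err;
}

// Only string types hold UTF-16 text
if (type != REG_SZ && type != REG_EXPAND_SZ)
    return ERROR_INVALID_DATATYPE;

// One extra WCHAR guarantees room for the terminator, since stored strings
// are not always null-terminated. std::vector (needs <vector>) frees itself,
// which fixes the leak of the old new[] buffer.
std::vector<WCHAR> res(cbData / sizeof(WCHAR) + 1, L'\0');

// Second call: read the data; cbData is updated to the bytes actually copied
err = RegQueryValueExW(h_sub_key, value_name, NULL, NULL, (LPBYTE) &res[0], &cbData);

if (err != ERROR_SUCCESS)
{
    if (err != ERROR_FILE_NOT_FOUND)
        debug->DebugMessage(Error::GetErrorMessageW(err));
    return err;   // never report success when the read failed
}

res[cbData / sizeof(WCHAR)] = L'\0';

result = wstring(&res[0]);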

return ERROR_SUCCESS;

}

Those values will be stored in an XML file.
Should that XML file be in UTF-16 or UTF-8?
Will I need to pass the remote system’s code page back for translation?
What other issues might I run into?


1 Answer

Editorial Team · Answer 1 · 2026-05-26T12:38:05+00:00

UTF-8 is the more common choice for network transfer because it is byte-oriented and has no endianness issues. With UTF-16 you need to specify a byte order for the transmission (or send a byte order mark). If you use a Unicode encoding end to end, you do not need to pass a code page.
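To make the endianness point concrete, here is a minimal sketch that serializes UTF-16 code units to bytes in an explicit byte order. `Utf16ToBytes` is a hypothetical helper written for this illustration, not something from the question's code, and it uses portable `std::u16string` rather than Windows `wstring`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): write each UTF-16 code unit
// as two bytes, in the byte order the caller asks for.
std::vector<std::uint8_t> Utf16ToBytes(const std::u16string& text, bool big_endian)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t unit : text)
    {
        const std::uint8_t high = static_cast<std::uint8_t>(unit >> 8);
        const std::uint8_t low  = static_cast<std::uint8_t>(unit & 0xFF);
        if (big_endian)
        {
            bytes.push_back(high);
            bytes.push_back(low);
        }
        else
        {
            bytes.push_back(low);
            bytes.push_back(high);
        }
    }
    return bytes;
}
```

For the string 您好 (U+60A8 U+597D), the little-endian form is A8 60 7D 59 and the big-endian form is 60 A8 59 7D, so the receiver has to know which one it is getting. The UTF-8 form of the same string is always E6 82 A8 E5 A5 BD.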

You can do the translation with standard Windows calls such as WideCharToMultiByte if the agents run on Windows machines.

std::wstring buffer_with_utf16;
std::string buffer_with_utf8;
if (!buffer_with_utf16.empty()) //a zero-length input makes the call fail
{
    //For CP_UTF8 the two default-char arguments must be NULL.
    //WC_ERR_INVALID_CHARS (Vista and later) makes invalid UTF-16 fail
    //instead of being silently replaced.
    int alength = WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS,
                  buffer_with_utf16.c_str(), (int)buffer_with_utf16.size(),
                  NULL, 0,
                  NULL, NULL);
    if (alength == 0)
        throw std::runtime_error("Bad UTF8 conversion"); //use GetLastError
    buffer_with_utf8.resize(alength); //no terminator needed: input length was explicit
    int written = WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS,
                  buffer_with_utf16.c_str(), (int)buffer_with_utf16.size(),
                  &buffer_with_utf8[0], (int)buffer_with_utf8.size(),
                  NULL, NULL);
    if (written == 0)
        throw std::runtime_error("Bad UTF8 conversion"); //use GetLastError
}
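Since the values end up in an XML file, the following is a rough sketch of the remaining steps in portable C++, for a collector that may not be running on Windows. It assumes the data arrives as UTF-16 code units in a `std::u16string`. The names `Utf16ToUtf8`, `XmlEscape`, `MakeValueXml` and the `<value>` element are all made up for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Convert UTF-16 code units to UTF-8, rejecting unpaired surrogates.
std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) // high surrogate: needs a low one next
        {
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            throw std::runtime_error("unpaired low surrogate");
        }

        if (cp < 0x80)
        {
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Escape the five XML special characters; UTF-8 multibyte sequences
// contain none of them, so they pass through unchanged.
std::string XmlEscape(const std::string& utf8)
{
    std::string out;
    for (char c : utf8)
    {
        switch (c)
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}

// Build a tiny UTF-8 XML document holding one value (element name is made up).
std::string MakeValueXml(const std::u16string& value)
{
    return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<value>"
         + XmlEscape(Utf16ToUtf8(value)) + "</value>\n";
}
```

Declaring `encoding="UTF-8"` in the XML declaration lets any parser read the file without knowing the source machine's code page. One issue this sketch does not handle: XML 1.0 forbids most control characters below U+0020, so a registry string that contains them would need filtering or a different representation before it is written out.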
