Question

Asked: May 11, 2026, 01:58:33 UTC

I’m writing some unit tests that verify our handling of various resources which use character sets other than the normal Latin alphabet: Cyrillic, Hebrew, etc.

The problem I have is that I cannot find a way to embed the expectations in the test source file: here’s an example of what I’m trying to do…

///
/// Protected: TestGetHebrewConfigString
///
void CPrIniFileReaderTest::TestGetHebrewConfigString()
{
    prwstring strHebrewTestFilePath = GetTestFilePath( strHebrewTestFileName );
    CPrIniFileReader prIniListReader( strHebrewTestFilePath.c_str() );
    prIniListReader.SetCurrentSection( strHebrewSubSection );

    CPPUNIT_ASSERT( prIniListReader.GetConfigString( L"דונדארןמע" ) == L"דונהשךוק" );
}

This quite simply doesn’t work. Previously I worked around this using a macro that calls a routine to transform a narrow string into a wide string (we use towstring all over the place in our applications, so it’s existing code):

#define UNICODE_CONSTANT( CONSTANT ) towstring( CONSTANT )

wstring towstring( LPCSTR lpszValue )
{
    wostringstream os;
    os << lpszValue;
    return os.str();
}

The assertion in the test above then became:

CPPUNIT_ASSERT( prIniListReader.GetConfigString( UNICODE_CONSTANT( "דונדארןמע" ) ) == UNICODE_CONSTANT( "דונהשךוק" ) );

This worked OK on OS X, but now I’m porting to Linux and the tests are all failing. It all feels rather hackish as well. Can anyone tell me if they have a nicer solution to this problem?

1 Answer

score 0 · Answer 1 · 2026-05-11T01:58:34+00:00

A tedious but portable way is to build your strings using numeric escape codes. For example:

const wchar_t *string = L"דונדארןמע";

becomes:

const wchar_t *string = L"\x05d3\x05d5\x05e0\x05d3\x05d0\x05e8\x05df\x05de\x05e2";

You have to convert all your Unicode characters to numeric escapes. That way your source code becomes encoding-independent.
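
As a minimal sketch of the idea (the function names here are hypothetical and not part of the question's code base), the expected value can be written with numeric escapes only, so the source file contains nothing but ASCII bytes. The second function uses universal character names (\uXXXX), which standard C++ also accepts in wide literals. The third illustrates one caveat: a \x escape consumes every hex digit that follows it, so a literal that continues with a character such as 'a' has to be split into adjacent literals.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the Hebrew key from the example above, written
// entirely with hex escapes so the file's own encoding no longer matters.
inline std::wstring hebrewKeyEscaped() {
    return L"\x05d3\x05d5\x05e0\x05d3\x05d0\x05e8\x05df\x05de\x05e2";
}

// The same text written with universal character names (\uXXXX).
inline std::wstring hebrewKeyUcn() {
    return L"\u05d3\u05d5\u05e0\u05d3\u05d0\u05e8\u05df\u05de\u05e2";
}

// Caveat: L"\x05d3abc" would be read as one oversized escape (\x05d3abc),
// because 'a', 'b' and 'c' are hex digits. Splitting the literal ends the
// escape after 05d3; the compiler then joins the two pieces.
inline std::wstring daletThenAbc() {
    return L"\x05d3" L"abc";
}
```

The returned strings can then be compared in an assertion like any other wide string.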

You can use online tools for conversion, such as this one. It outputs the JavaScript escape format \uXXXX, so just search & replace \u with \x to get the C format.
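
If an online tool is not convenient, the conversion can also be scripted. The following is only a sketch, assuming a C++11 compiler and code points that fit in four hex digits; toHexEscapes is a hypothetical helper name, not an existing library function. It turns a sequence of code points into the \x escape text that can be pasted into a wide string literal.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats each code point as a \xXXXX escape.
// Assumes every code point is below 0x10000; larger values would not fit
// in a 16-bit wchar_t and are outside the scope of this sketch.
inline std::string toHexEscapes(const std::u32string& codePoints) {
    std::string out;
    char buf[16];
    for (char32_t cp : codePoints) {
        // Zero-padded, lowercase hex, at least four digits per code point.
        std::snprintf(buf, sizeof buf, "\\x%04lx",
                      static_cast<unsigned long>(cp));
        out += buf;
    }
    return out;
}
```

For example, passing the two code points U+05D3 and U+05D5 yields the text \x05d3\x05d5.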
