I’m writing some unit tests which are going to verify our handling of various resources that use other character sets apart from the normal latin alphabet: Cyrilic, Hebrew etc.
The problem I have is that I cannot find a way to embed the expectations in the test source file: here’s an example of what I’m trying to do…
/// /// Protected: TestGetHebrewConfigString /// void CPrIniFileReaderTest::TestGetHebrewConfigString() { prwstring strHebrewTestFilePath = GetTestFilePath( strHebrewTestFileName ); CPrIniFileReader prIniListReader( strHebrewTestFilePath.c_str() ); prIniListReader.SetCurrentSection( strHebrewSubSection ); CPPUNIT_ASSERT( prIniListReader.GetConfigString( L'דונדארןמע' ) == L'דונהשךוק') ); }
This quite simply doesnt work. Previously I worked around this using a macro which calls a routine to transform a narrow string to a wide string (we use towstring all over the place in our applications so it’s existing code)
#define UNICODE_CONSTANT( CONSTANT ) towstring( CONSTANT ) wstring towstring( LPCSTR lpszValue ) { wostringstream os; os << lpszValue; return os.str(); }
The assertion in the test above then became:
CPPUNIT_ASSERT( prIniListReader.GetConfigString( UNICODE_CONSTANT( 'דונדארןמע' ) ) == UNICODE_CONSTANT( 'דונהשךוק' ) );
This worked OK on OS X but now I’m porting to linux and I’m finding that the tests are all failing: it all feels rather hackish as well. Can anyone tell me if they have a nicer solution to this problem?
A tedious but portable way is to build your strings using numeric escape codes. For example:
becomes:
You have to convert all your Unicode characters to numeric escapes. That way your source code becomes encoding-independent.
You can use online tools for conversion, such as this one. It outputs the JavaScript escape format
\uXXXX, so just search & replace\uwith\xto get the C format.