I’m using NUnit v2.5 to compare strings that contain composite Unicode characters.
Although the comparison itself works fine, the caret indicating the first difference seems to be misplaced.
UPD: I’ve ended up with an overridden EqualConstraint that in turn invokes a custom TextMessageWriter, so I no longer need an answer. See the solution below.
Here’s the snippet:
string s1 = "ใช้งานง่าย";
string s2 = "ใช้งานงาย";
Assert.That(s1, Is.EqualTo(s2));
Here’s the output:
Expected: "ใช้งานงาย"
But was: "ใช้งานง่าย"
------------------^
The arrow indicating the first different character seems to be off by 2 positions (as many as there are tone marks above). For longer strings, it becomes a real pain.
I tried String.Normalize(), but it did not help either.
How can I overcome this problem? Thanks for your help. See my answer below.
I don’t think I will find a better answer, so I’m answering my own question.
Cause.
Many languages use non-spacing modifiers (combining marks) for characters. For European languages, precomposed substitutes exist, e.g.
"u" (U+0075) + "◌̈" (U+0308) = "ü" (U+00FC). In this case, the solution by @tchrist is quite sufficient.
However, for complex writing systems, there is no precomposed substitute for a base character plus its non-spacing modifiers. NUnit’s
TextMessageWriter.WriteCaretLine(int mismatch) treats the mismatch parameter as an offset in chars (UTF-16 code units), while the on-screen representation of a Thai string may be shorter than the caret line ("-----^") drawn from that offset.
Solution.
Force WriteCaretLine(int mismatch) to respect non-spacing modifiers by reducing the mismatch value by the number of non-spacing modifiers that occur before this offset.
Implement all the supplementary classes that are needed only to get your new code invoked.
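To illustrate the idea, here is a minimal sketch and not the actual NUnit code. The class name CaretColumnSketch and its methods are hypothetical. It assumes that only characters in the Unicode categories Mn (non-spacing mark) and Me (enclosing mark) take no column of their own, and that the output font is monospaced.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of NUnit.
public static class CaretColumnSketch
{
    // Length in chars after canonical composition (NFC).
    // "u" + U+0308 composes into a single "ü"; a Thai consonant plus
    // a tone mark has no precomposed form, so its length is unchanged.
    public static int ComposedLength(string s)
    {
        return s.Normalize(NormalizationForm.FormC).Length;
    }

    // Counts the chars before 'mismatch' that are drawn on top of the
    // preceding character and therefore take no column of their own.
    public static int CountNonSpacing(string text, int mismatch)
    {
        int count = 0;
        int end = Math.Min(mismatch, text.Length);
        for (int i = 0; i < end; i++)
        {
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(text[i]);
            if (category == UnicodeCategory.NonSpacingMark ||
                category == UnicodeCategory.EnclosingMark)
            {
                count++;
            }
        }
        return count;
    }

    // Converts a char offset into a display column for the caret.
    public static int DisplayColumn(string text, int mismatch)
    {
        return mismatch - CountNonSpacing(text, mismatch);
    }
}
```

For the strings from the question, the first mismatch is at char index 7 and one tone mark precedes it, so DisplayColumn returns 6. An overridden WriteCaretLine would then use this column instead of the raw offset when building the dashes.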
Along with Thai, I have tested it with Devanagari and Tibetan. It works as expected.
Yet another pitfall. If you’re using NUnit with Visual Studio through ReSharper like I do, you have to configure Internet Explorer’s fonts (this cannot be managed from R#) so that it uses proper monospaced fonts for Thai, Devanagari, etc.
Implementation.
- Derive from TextMessageWriter and override its DisplayStringDifferences;
- Provide your own ClipExpectedAndActual and FindMismatchPosition: this is where non-spacing modifiers are respected. Proper clipping is needed since it may also affect the calculation of non-spacing elements;
- Derive from EqualConstraint and override its WriteMessageTo(MessageWriter writer) so that your MessageWriter is used.
The source code goes below. About 80% of the code doesn’t do anything useful, but it’s included due to access levels in the original code.