I saw this post on Jon Skeet’s blog where he talks about string reversing. I wanted to try the example he showed myself, but it seems to work… which leads me to believe that I have no idea how to create a string that contains a surrogate pair which will actually cause the string reversal to fail. How does one actually go about creating a string with a surrogate pair in it so that I can see the failure myself?
The term “surrogate pair” refers to a means of encoding Unicode characters with high code points in the UTF-16 encoding scheme (see this page for more information):

In the Unicode character encoding, characters are mapped to values between 0x000000 and 0x10FFFF. Internally, a UTF-16 encoding scheme is used to store strings of Unicode text as sequences of two-byte (16-bit) code units. Since two bytes can only hold values from 0x0000 to 0xFFFF, some additional complexity is needed to store values above this range (0x010000 to 0x10FFFF).

This is done using pairs of 16-bit values known as surrogates. The surrogates are classified in two distinct ranges known as high surrogates (0xD800 to 0xDBFF) and low surrogates (0xDC00 to 0xDFFF), depending on whether they are allowed at the start or the end of the two-unit sequence: a high surrogate always comes first and a low surrogate second. A naive reversal swaps the two, which leaves an invalid sequence.

Try this yourself:
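A minimal sketch of such a test, written in Java on the assumption that its UTF-16 strings behave like the ones discussed here. The class and method names (`SurrogateDemo`, `naiveReverse`, `codePointReverse`) are illustrative, and U+1D11E (the musical G clef) is just one convenient character above 0xFFFF; it is stored as the pair 0xD834 0xDD1E.

```java
public class SurrogateDemo {
    // Reverses one 16-bit char (UTF-16 code unit) at a time.
    // This swaps the two halves of every surrogate pair.
    static String naiveReverse(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0, j = chars.length - 1; i < j; i++, j--) {
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new String(chars);
    }

    // Reverses one code point at a time, so surrogate pairs stay intact.
    static String codePointReverse(String s) {
        int[] codePoints = s.codePoints().toArray();
        StringBuilder sb = new StringBuilder();
        for (int i = codePoints.length - 1; i >= 0; i--) {
            sb.appendCodePoint(codePoints[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Build "a", U+1D11E, "b"; the middle character needs two chars.
        String s = "a" + new String(Character.toChars(0x1D11E)) + "b";

        System.out.println("chars: " + s.length());
        System.out.println("code points: " + s.codePointCount(0, s.length()));

        String broken = naiveReverse(s);
        // After the naive reversal the low surrogate precedes the high one.
        System.out.println("low surrogate first: "
                + Character.isLowSurrogate(broken.charAt(1)));

        String ok = codePointReverse(s);
        System.out.println("pair preserved: "
                + (ok.codePointAt(1) == 0x1D11E));
    }
}
```

Note that the sketch deliberately avoids `StringBuilder.reverse()`, which already treats surrogate pairs as single characters and would therefore hide the failure.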
or this, if you want to stick with the blog example:
and then check the string values with the debugger. Jon Skeet is damn right… strings and dates seem easy but they are absolutely NOT.