I'm trying to obtain the actual Unicode characters represented by the escape sequences in this string:
string originalString = "\u0605\u04c3\u5000\u0000\u5000\ufd00\u4400\ud500\u7600\ud300\u4f00\ubc00\u0c00\u2d00\u4000\ue400\u0e00\u7400\u4800\ub700\u1d00\u1300\ue900\u6000\u4c00\ufb00\u9900\u3900\ud900\u6700\uae00\ueb00\u8f00\u2800\u0200\ub300\u5c00\ufe00\u0100\u3d00\u9100\u3000\u0300\u1600\u0100\u7000\u6200\u8e00\u1d00\u8e00\u6200\ua900\u6300\uc800\u0900\ub700\ub000\u6000\ue400\u9200\u3f00\u9100\u8d00\uef00\u3600\u0100\u9e00\u0081";
If I hard-code it in the .cs file, the debugger shows the correct characters. But if I put the exact same text in a file and read it, the string comes back exactly as written in the file, with the \uXXXX sequences left unconverted.
TextReader tr = new StreamReader("c:\\test.txt");
string tmpString = tr.ReadLine();
tr.Close();
byte[] array = Encoding.Unicode.GetBytes(tmpString);
string finalResult = Encoding.Unicode.GetString(array);
How can I make finalResult contain the correct Unicode characters?
Thanks in advance
Gonçalo
EDIT: I already tried passing an encoding to the reader:
TextReader tr = new StreamReader("c:\\test.txt",Encoding.Unicode);
but the resulting characters are different from the correct ones.
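Does your file actually contain the literal escape sequences, i.e. a backslash, a u, and four hex digits (such as \u0605), rather than the characters themselves?
If so, you need to convert each sequence to its corresponding Unicode character yourself. The \uXXXX notation is only interpreted by the C# compiler inside string literals, which is why the hard-coded version works. An Encoding only controls how bytes map to characters, so reading the file with Encoding.Unicode, or round-tripping through GetBytes and GetString, does not decode the escapes.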
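One way to do that conversion is a regular expression that finds each escape and replaces it with the character it names. This is only a sketch, assuming the file holds nothing but \uXXXX sequences with exactly four hex digits each; the UnicodeEscapes class and its Decode method are names made up for this example, not part of the framework.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, named for this example only.
static class UnicodeEscapes
{
    // A backslash, the letter 'u', then exactly four hex digits (captured).
    private static readonly Regex EscapePattern =
        new Regex(@"\\u([0-9A-Fa-f]{4})");

    public static string Decode(string text)
    {
        // Parse the four hex digits as a UTF-16 code unit and emit that char.
        return EscapePattern.Replace(text, m =>
            ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }
}

class Program
{
    static void Main()
    {
        // Same path as in the question; the file contains the literal escapes.
        string tmpString = File.ReadAllText(@"c:\test.txt");
        string finalResult = UnicodeEscapes.Decode(tmpString);

        // One char per escape sequence that was found and decoded.
        Console.WriteLine(finalResult.Length);
    }
}
```

Text that does not match the pattern is left untouched. Regex.Unescape would also convert \uXXXX sequences, but it interprets other backslash escapes in the input as well, so the explicit pattern is the narrower choice here.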