I’ve got a big problem with encoding. The code I’m using should work but it doesn’t!
Here is the code:
    FileStream fs = new FileStream(saveFile, FileMode.Create, FileAccess.Write, FileShare.None);
    System.IO.StreamWriter objWriter;
    objWriter = new System.IO.StreamWriter(fs, Encoding.Unicode);
    string textLine;
    if (System.IO.File.Exists(readFile) == true)
    {
        System.IO.StreamReader objReader;
        objReader = new System.IO.StreamReader(readFile, Encoding.Unicode);
        do
        {
            textLine = objReader.ReadLine();
            if (textLine.IndexOf(searchString) != -1)
            {
                tempString = textLine;
                position1 = textLine.IndexOf(searchString);
                tempString = textLine.Substring(position1);
                if (tempString.IndexOf("(") != -1)
                {
                    position2 = tempString.IndexOf("(");
                    //MessageBox.Show(tempString.Length.ToString());
                    tempString = tempString.Substring(0, position2);
                }
            }
            objWriter.WriteLine(textLine);
        } while (objReader.Peek() != -1);
    }
    objWriter.Close();
    MessageBox.Show(tempString);
    MessageBox.Show("Done!");
I have to read a file that contains a mix of English and Cyrillic characters, but after reading and processing it, when I save the file to a new location all the Cyrillic symbols come out as “?” or some other unknown symbol. I tried every possible encoding and it does not work!
From the example you posted it seems that the file doesn’t have a BOM and yet it contains Cyrillic characters. Without a BOM the StreamReader cannot guess the correct encoding, and reading the file as Encoding.Unicode decodes the bytes incorrectly. So you could assume Windows-1251 encoding, since the file contains Cyrillic characters (according to the hex dump you have shown in the comments section). So here’s what you may try:
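A minimal sketch of that idea, assuming the input really is Windows-1251 (code page 1251). The names EncodingSketch and ConvertFile are made up for illustration, and the search/substring processing from your loop is left out so that only the encoding handling is shown:

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingSketch
{
    // Hypothetical helper: copies a text file line by line, decoding the
    // input as Windows-1251 and writing the output as UTF-16 (with a BOM).
    public static void ConvertFile(string readFile, string saveFile)
    {
        // On .NET Core / .NET 5+ code page 1251 is only available after
        // registering this provider. On the classic .NET Framework the code
        // page is built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Assumption: the file is Windows-1251. This is a guess based on
        // the Cyrillic content, not something the file itself declares.
        Encoding inputEncoding = Encoding.GetEncoding(1251);

        using (var reader = new StreamReader(readFile, inputEncoding))
        using (var writer = new StreamWriter(saveFile, false, Encoding.Unicode))
        {
            string line;
            // ReadLine returns null at end of file, so an empty input
            // file is handled without a NullReferenceException.
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
            }
        }
    }
}
```

You would call it as EncodingSketch.ConvertFile(readFile, saveFile). If the output still shows garbage, the 1251 assumption is probably wrong and the bytes in the hex dump should be checked against another code page.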