Possible Duplicate: Accented characters not correctly imported with BULK INSERT A .net program running

Question

Editorial Team

Asked: June 16, 2026

Possible Duplicate:
Accented characters not correctly imported with BULK INSERT

A .NET program running on my system produces a CSV file for me. I would like to know the encoding of that file.

The CSV file should contain the characters é, ä, å and æ, but they are shown as � (the file is reported as UTF-8 with BOM). Is there any way to get these characters back, either as the originals or as their closest English equivalents?

The CSV file is created by a .NET program running on the same machine under the same user, but once the file has been created I cannot see the original characters.

1 Answer

Editorial Team · Answer 1 · 2026-06-16T01:45:54+00:00

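If you see � when you decode the file as UTF-8, but ï¿½ when you decode it as Windows-1252, then the file literally contains the character � (U+FFFD, the Unicode replacement character). That is, it contains the bytes 0xEF 0xBF 0xBD, which are the UTF-8 encoding of �. The original characters were already replaced before the file was written, so the data is unrecoverable at this point.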
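One way to confirm this diagnosis is to look at the raw bytes rather than at decoded text. The following is a minimal sketch, not a definitive tool: the class and method names are made up for illustration, and "myfile.csv" is a placeholder path.

```csharp
using System;
using System.IO;

static class ReplacementCharScan
{
    // Counts occurrences of the byte sequence EF BF BD,
    // which is the UTF-8 encoding of U+FFFD (the replacement character).
    public static int CountReplacementSequences(byte[] data)
    {
        int count = 0;
        for (int i = 0; i + 2 < data.Length; i++)
        {
            if (data[i] == 0xEF && data[i + 1] == 0xBF && data[i + 2] == 0xBD)
            {
                count++;
                i += 2; // skip the rest of the matched sequence
            }
        }
        return count;
    }

    static void Main(string[] args)
    {
        // Placeholder path; pass the real file as the first argument.
        string path = args.Length > 0 ? args[0] : "myfile.csv";
        byte[] data = File.ReadAllBytes(path);
        int hits = CountReplacementSequences(data);
        Console.WriteLine(hits > 0
            ? "Found " + hits + " literal U+FFFD sequence(s); those characters are already lost."
            : "No literal U+FFFD found; the file may just be read with the wrong encoding.");
    }
}
```

If the count is zero, the � on screen is probably produced by the viewer, and opening the file with a different encoding may still recover the text.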

This happens when the actual encoding of a byte stream does not match the encoding used to decode it. For instance, the data is physically encoded as Windows-1252, but a program decodes it into an internal string as UTF-8 with replacement fallback, so every byte sequence that is invalid in UTF-8 becomes �. The string now contains � internally, but nobody inspects it, and it is written out to a file as UTF-8. The resulting file is what you have.
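The sequence described above can be reproduced in a few lines. This is only a sketch of the failure mode, assuming the producing program did something equivalent; the class and method names are hypothetical, and the input bytes are hard-coded so that no Windows-1252 code page support is needed.

```csharp
using System;
using System.Text;

static class CorruptionDemo
{
    // Simulates the faulty pipeline: bytes that are really Windows-1252
    // are decoded as UTF-8 (Encoding.UTF8 uses replacement fallback),
    // then the resulting string is written back out as UTF-8.
    public static byte[] RoundTripThroughWrongDecoder(byte[] windows1252Bytes)
    {
        string decoded = Encoding.UTF8.GetString(windows1252Bytes);
        return Encoding.UTF8.GetBytes(decoded);
    }

    static void Main()
    {
        // "é;x" in Windows-1252: é is the single byte 0xE9.
        byte[] original = { 0xE9, 0x3B, 0x78 };
        byte[] written = RoundTripThroughWrongDecoder(original);

        Console.WriteLine(BitConverter.ToString(original));
        Console.WriteLine(BitConverter.ToString(written));
    }
}
```

The second line should show EF-BF-BD where E9 used to be. The same three bytes would appear for ä, å or æ, which is why nothing in the output file says which character was there originally.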

To avoid the original screw-up, it is a good idea to use exception fallback instead of replacement fallback when decoding files, for example:

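using System;
using System.IO;
using System.Text;

// Strict UTF-8: throw on invalid bytes instead of silently substituting U+FFFD.
Encoding enc = Encoding.GetEncoding(
    "UTF-8",
    new EncoderExceptionFallback(),
    new DecoderExceptionFallback()
);

try
{
    string text = File.ReadAllText(@"myfile.csv", enc);
}
catch (DecoderFallbackException)
{
    Console.WriteLine("This file was not encoded in UTF-8, try some other encoding");
}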

Now you get an exception when the file isn’t UTF-8, and you can either try another encoding or let the user know that their file must be in UTF-8.
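The "try another encoding" branch could look roughly like the sketch below. It rests on assumptions that are not in the original answer: that Windows-1252 is the right second guess for your data, and that the code runs on .NET Core 3.0 or later, where code page 1252 has to be registered first. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class FallbackDecoder
{
    // Strict UTF-8: throws DecoderFallbackException on invalid bytes.
    static readonly Encoding StrictUtf8 = Encoding.GetEncoding(
        "UTF-8",
        new EncoderExceptionFallback(),
        new DecoderExceptionFallback());

    // Tries strict UTF-8 first and falls back to Windows-1252.
    public static string Decode(byte[] data)
    {
        try
        {
            return StrictUtf8.GetString(data);
        }
        catch (DecoderFallbackException)
        {
            // Assumption: anything that is not valid UTF-8 is Windows-1252.
            return Encoding.GetEncoding(1252).GetString(data);
        }
    }

    static void Main()
    {
        // Makes code page 1252 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string fromLegacy = Decode(new byte[] { 0x63, 0x61, 0x66, 0xE9 });      // Windows-1252 bytes
        string fromUtf8 = Decode(new byte[] { 0x63, 0x61, 0x66, 0xC3, 0xA9 }); // UTF-8 bytes
        Console.WriteLine(fromLegacy == fromUtf8); // both decode to the same string
    }
}
```

Note that Windows-1252 accepts almost any byte sequence, so this fallback never fails loudly. It is only as good as the guess about the source encoding, and it has to run on the original bytes, before anything has been replaced by �.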
