I’m having a problem comparing strings in a Unit Test in C# 4.0 using

Question

0

Asked: May 15, 20262026-05-15T01:02:59+00:00 2026-05-15T01:02:59+00:00

I’m having a problem comparing strings in a Unit Test in C# 4.0 using

0

I’m having a problem comparing strings in a Unit Test in C# 4.0 using Visual Studio 2010. This same test case works properly in Visual Studio 2008 (with C# 3.5).

Here’s the relevant code snippet:

byte[] rawData = GetData();
string data = Encoding.UTF8.GetString(rawData);

Assert.AreEqual("Constant", data, false, CultureInfo.InvariantCulture);

While debugging this test, the data string appears to the naked eye to contain exactly the same string as the literal. When I called data.ToCharArray(), I noticed that the first byte of the string data is the value 65279 which is the UTF-8 Byte Order Marker. What I don’t understand is why Encoding.UTF8.GetString() keeps this byte around.

How do I get Encoding.UTF8.GetString() to not put the Byte Order Marker in the resulting string?

Update: The problem was that GetData(), which reads a file from disk, reads the data from the file using FileStream.readbytes(). I corrected this by using a StreamReader and converting the string to bytes using Encoding.UTF8.GetBytes(), which is what it should’ve been doing in the first place! Thanks for all the help.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-15T01:02:59+00:00

Well, I assume it’s because the raw binary data includes the BOM. You could always remove the BOM yourself after decoding, if you don’t want it – but you should consider whether the byte array should consider the BOM to start with.

EDIT: Alternatively, you could use a StreamReader to perform the decoding. Here’s an example, showing the same byte array being converted into two characters using Encoding.GetString or one character via a StreamReader:

using System;
using System.IO;
using System.Text;

class Test
{
    static void Main()
    {
        byte[] withBom = { 0xef, 0xbb, 0xbf, 0x41 };
        string viaEncoding = Encoding.UTF8.GetString(withBom);
        Console.WriteLine(viaEncoding.Length);

        string viaStreamReader;
        using (StreamReader reader = new StreamReader
               (new MemoryStream(withBom), Encoding.UTF8))
        {
            viaStreamReader = reader.ReadToEnd();           
        }
        Console.WriteLine(viaStreamReader.Length);
    }
}

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I’m having a problem comparing strings in a Unit Test in C# 4.0 using

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply