Question

Asked: May 12, 2026 (2026-05-12T08:50:27+00:00)

In C#, I have a string that I’m obtaining from WebClient.DownloadString. I’ve tried setting client.Encoding to new UTF8Encoding(false), but that’s made no difference – I still end up with a byte order mark for UTF-8 at the beginning of the result string. I need to remove this (to parse the resulting XML with LINQ), and want to do so in memory.

So I have a string that starts with \x00EF\x00BB\x00BF, and I want to remove that if it exists. Right now I’m using

if (xml.StartsWith(ByteOrderMarkUtf8))
{
    xml = xml.Remove(0, ByteOrderMarkUtf8.Length);
}

but that just feels wrong. I’ve tried all sorts of code with streams, GetBytes, and encodings, and nothing works. Can anyone provide the "right" algorithm to strip a BOM from a string?

1 Answer

Editorial Team · Answer 1 · 2026-05-12T08:50:27+00:00

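If the variable xml is of type string, something has already gone wrong. In a character string, the BOM should not appear as three separate characters, but as a single code point, U+FEFF. Seeing \x00EF\x00BB\x00BF instead suggests that the UTF-8 bytes were decoded with a single-byte encoding such as ISO-8859-1.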
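If the string has already been decoded as UTF-8, the BOM shows up as one leading U+FEFF character, and removing it is a single-character check. The following is a minimal sketch under that assumption; the class BomHelper and the method StripBom are hypothetical names, not part of the question's code. It compares the first char directly rather than calling StartsWith, since culture-sensitive string comparison can ignore zero-width characters like U+FEFF.

```csharp
using System;
using System.Text;

public static class BomHelper
{
    // A correctly decoded BOM is the single character U+FEFF.
    private const char Bom = '\uFEFF';

    // Hypothetical helper: returns the text without one leading U+FEFF, if present.
    public static string StripBom(string text)
    {
        if (!string.IsNullOrEmpty(text) && text[0] == Bom)
        {
            return text.Substring(1);
        }
        return text;
    }

    public static void Main()
    {
        // UTF-8 preamble (EF BB BF) followed by "<a/>".
        byte[] raw = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'a', (byte)'/', (byte)'>' };

        // GetString does not drop the preamble; it decodes it to U+FEFF.
        string decoded = Encoding.UTF8.GetString(raw);

        Console.WriteLine(decoded.Length);             // 5: U+FEFF plus four characters
        Console.WriteLine(StripBom(decoded).Length);   // 4
    }
}
```

This only treats the symptom. If the string really contains three characters in place of the BOM, the rest of the text may have been decoded incorrectly as well, and stripping them will not repair any non-ASCII content.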

Instead of using DownloadString, use DownloadData and parse the resulting byte array. The XML parser should recognize the BOM itself and skip it, using it only to auto-detect the document encoding as UTF-8.
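As a rough illustration of the byte-based approach, the sketch below wraps the downloaded bytes in a MemoryStream and hands them to XDocument.Load, so the parser deals with the BOM. The class XmlBytesDemo and the method ParseXml are hypothetical names, and the byte array is built by hand as a stand-in for the result of client.DownloadData so that the example runs without a network call.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XmlBytesDemo
{
    // Hypothetical helper: parses raw bytes and lets the XML reader
    // detect the encoding and consume any BOM.
    public static XDocument ParseXml(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        {
            return XDocument.Load(stream);
        }
    }

    public static void Main()
    {
        // Stand-in for client.DownloadData(url): UTF-8 preamble followed by XML.
        byte[] preamble = Encoding.UTF8.GetPreamble();
        byte[] body = Encoding.UTF8.GetBytes("<root><item>1</item></root>");
        byte[] data = new byte[preamble.Length + body.Length];
        preamble.CopyTo(data, 0);
        body.CopyTo(data, preamble.Length);

        XDocument doc = ParseXml(data);
        Console.WriteLine(doc.Root.Name);   // root
    }
}
```

With real downloads, the same ParseXml call would receive the array returned by DownloadData, and the string never needs to be created or cleaned up by hand.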
