When I try to parse a response from a certain REST API, I’m getting an XmlException saying “Data at the root level is invalid. Line 1, position 1.” Looking at the XML it looks fine, but then examining the first character I see that it is actually a zero-width no-break space (character code 65279 or 0xFEFF).
Is there any good reason for that character to be there? Maybe I’m supposed to be setting a different Encoding when I make my request? Currently I’m using Encoding.UTF8.
I’ve thought about just removing the character from the string, or asking the developer of the REST API to fix it, but before I do either of those things I wanted to check whether there is a valid reason for that character to be there. I’m no Unicode expert. Is there something different I should be doing?
Edit: I suspected that it might be something like that (BOM). So, the question becomes, should I have to deal with this character specially? I’ve tried loading the XML two ways and both throw the same exception:
    public static User GetUser()
    {
        WebClient req = new WebClient();
        req.Encoding = Encoding.UTF8;
        string response = req.DownloadString(url);

        // Attempt 1: XmlSerializer over the downloaded string
        XmlSerializer ser = new XmlSerializer(typeof(User));
        User user = ser.Deserialize(new StringReader(response)) as User;

        // Attempt 2: LINQ to XML over the same string
        XElement xUser = XElement.Parse(response);
        ...
        return user;
    }
Instead of using Encoding.UTF8, create your own UTF-8 encoder, using the constructor overload that lets you specify whether or not the BOM is to be emitted:
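A minimal sketch of that suggestion, assuming the `req` `WebClient` from the question. The class and variable names (`NoBomEncodingExample`, `utf8NoBom`) are illustrative and not part of any API:

```csharp
using System;
using System.Net;
using System.Text;

public static class NoBomEncodingExample
{
    public static void Main()
    {
        // Passing false for encoderShouldEmitUTF8Identifier yields a UTF-8
        // encoding whose preamble (the BOM bytes EF BB BF) is empty.
        Encoding utf8NoBom = new UTF8Encoding(false);

        // Encoding.UTF8 reports a 3-byte preamble; the custom instance reports none.
        Console.WriteLine(Encoding.UTF8.GetPreamble().Length);
        Console.WriteLine(utf8NoBom.GetPreamble().Length);

        using (var req = new WebClient())
        {
            req.Encoding = utf8NoBom;
            // string response = req.DownloadString(url); // url as in the question
        }
    }
}
```

The constructor flag controls the preamble the encoding writes. Whether `DownloadString` also drops a BOM that arrives in the response may depend on the framework version, so it is worth checking the first character of the result.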
I believe that will do the trick for you.
Amended to note: the following will work:
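One way this can be done, sketched here under assumptions rather than taken from the original post: download the raw bytes and decode them through a `StreamReader`, which consumes a leading byte order mark instead of returning it as a character. The helper names (`BomSafeDownload`, `DownloadWithoutBom`, `DecodeWithoutBom`) are hypothetical.

```csharp
using System.IO;
using System.Net;
using System.Text;

public static class BomSafeDownload
{
    // Hypothetical helper: fetch the response as bytes, then decode it.
    public static string DownloadWithoutBom(string url)
    {
        using (var client = new WebClient())
        {
            byte[] raw = client.DownloadData(url);
            return DecodeWithoutBom(raw);
        }
    }

    // Hypothetical helper: StreamReader skips a leading UTF-8 BOM
    // (EF BB BF), so the returned string starts at the first real character.
    public static string DecodeWithoutBom(byte[] raw)
    {
        using (var ms = new MemoryStream(raw))
        using (var reader = new StreamReader(ms, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The resulting string can then be passed to `XElement.Parse` or wrapped in a `StringReader` for `XmlSerializer.Deserialize`, as in the question's `GetUser` method.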