I’m using c# to interact with a database that has an exposed REST API. The table that I’m interested in contains forum posts, some of which themselves contain xml.
Whenever my result set contains a post that has xml, my application throws an error as follows:
Exception Details: System.Xml.XmlException: ‘>’ is an unexpected token. The expected token is ‘”‘ or ”’. Line 1, position 62.
And this is the line that fails:
Line 44: ds.ReadXml(xmlData);
And this is the code I’m using:
var webClient = new WebClient();
string searchString = searchValue.Text;
string requestUrl = "http://myserver/restapi.ashx/search.xml?pagesize=4&pageindex=0&query=";
requestUrl += searchString;
XmlReaderSettings settings = new XmlReaderSettings();
settings.ProhibitDtd = false;
XmlReader xmlData = XmlReader.Create(webClient.OpenRead(requestUrl),settings);
DataSet ds = new DataSet();
ds.ReadXml(xmlData);
Repeater1.DataSource = ds.Tables[1];
Repeater1.DataBind();
And this is the type of XML record that it’s choking on (the stuff in the node is causing the problem):
<SearchResults PageSize="1" PageIndex="0" TotalCount="342">
<SearchResult>
<ContentId>994</ContentId>
<Title>Help Files: What are they written in?</Title>
<Url>http://myserver/linktest.aspx</Url>
<Date>2008-10-16T16:18:00+01:00</Date><ContentType>post</ContentType>
<Body><div class="ForumPostBodyArea"> <div class="ForumPostContentText"> <p>Can anyone see anything obviously wrong with this xml, when its fired to CRM Its creating 13 null records.</p> <p><?xml version="1.0" encoding="UTF-8"?><soap:Envelope xmlns:typens="<a href="http://tempuri.org/type">http://tempuri.org/type</a>" soap:encodingStyle="<a href="http://schemas.xmlsoap.org/soap/encoding/">http://schemas.xmlsoap.org/soap/encoding/</a>" xmlns:soap="<a href="http://schemas.xmlsoap.org/soap/envelope/">http://schemas.xmlsoap.org/soap/envelope/</a>" xmlns:xsi="<a href="http://www.w3.org/2001/XMLSchema-instance">http://www.w3.org/2001/XMLSchema-instance</a>" xmlns:soapenc="<a href="http://schemas.xmlsoap.org/soap/encoding/">http://schemas.xmlsoap.org/soap/encoding/</a>" xmlns:wsdlns="<a href="http://tempuri.org/wsdl/">http://tempuri.org/wsdl/</a>" xmlns:xsd="<a href="http://www.w3.org/2001/XMLSchema%22%3E%3Csoap:Header%3E%3CSessionHeader%3E%3CsessionId">http://www.w3.org/2001/XMLSchema"><soap:Header><SessionHeader><sessionId</a> xsi:type="xsd:long">18208442035524</sessionId></SessionHeader></soap:Header><soap:Body><typens:add><entityname xsi:type="xsd:string">lead</entityname><records xsi:nil="true" xsi:type="typens:ewarebase" /><status xsi:type="xsd:string">PreRegistration</status><requester xsi:type="xsd:string">Mimnagh</requester><personfirstname xsi:type="xsd:string">Sean</personfirstname><personlastname xsi:type="xsd:string">Test2</personlastname><personsalutation xsi:type="xsd:string">Mr</personsalutation><details xsi:type="xsd:string">test project details</details><description xsi:type="xsd:string">test description details</description><comments xsi:type="xsd:string">test project comments</comments><personemail xsi:type="xsd:string">smimnagh@mac.com</personemail><personphonenumber xsi:type="xsd:string">12334566777</personphonenumber><type xsi:type="xsd:string">PreReg</type><companyname xsi:type="xsd:string">Site Client</companyname></typens:add></soap:Body></soap:Envelope></p> <p>Many thanks</p> </div> </div>
</Body>
<Tags>
<Tag>xml</Tag>
</Tags>
<IndexedAt>2010-07-08T11:53:46.848+01:00</IndexedAt>
</SearchResult>
</SearchResults>
Is there something that I can do with the xmlreader to make it ignore whatever’s causing the problem?
Please note that I can’t change the XML prior to consuming it – so if it’s malformed then I wonder if there’s a way to ignore or modify that particular record without generating an error?
Thanks!
It looks like some of your quotes need escaping in the contents of some of your elements. Try using
for quote marks that aren’t wrapping attribute values.
UPDATE:
Because the data you want to read isn’t strictly XML (it’s nearly XML) you’re best bet is to
If you have to go with point 2, the simplest thing that pops into my head is to read the characters of the ‘XML’ counting in and out of angle brackets. If you find any ” characters and you’re not within any angle brackets, replace the ” with
But note that doing that is a complete last resort.