A piece of HTML that I’m trying to parse contains some attributes values without


Asked: June 10, 2026


A piece of HTML that I’m trying to parse contains some attribute values without quotation marks, for example the width and height attributes here:

<img src="/static/logo.png" width=75 height=90 />

In the C# code, an XmlReader reads ahead to the next anchor tag:

while (reader.ReadToFollowing("a"))

This statement throws an XmlException:

'75' is an unexpected token. The expected token is '"' or '''. Line 16, position 37.

Is there an XmlReaderSettings option that makes the XmlReader more lenient? I do not have control over the generated HTML.


1 Answer

Editorial Team · Answer 1 · 2026-06-10T01:50:21+00:00

To read HTML, you’ll need a reader designed for that purpose. The HtmlAgilityPack can help you here, as can the SgmlReader mentioned in an answer to a related question.
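As a minimal sketch of the HtmlAgilityPack route, assuming the HtmlAgilityPack NuGet package is referenced: the code below loads markup with unquoted attribute values and collects the href of each anchor, which is roughly what the ReadToFollowing("a") loop in the question is after. The class name, the ExtractLinks helper, and the sample anchor are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is installed

public static class LenientHtmlDemo
{
    // Hypothetical helper: returns the href of every anchor, in document order.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parser: unquoted attribute values are accepted

        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }

    public static void Main()
    {
        // The img tag is the one from the question; the anchor is an invented example.
        string html = "<img src=\"/static/logo.png\" width=75 height=90 />" +
                      "<a href=\"/about\">About</a>";

        foreach (string href in ExtractLinks(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

If the rest of the code must keep using the XmlReader API, SgmlReader may be the closer fit, since it is exposed as an XmlReader over HTML input.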

HTML is not XML. They are both based on SGML, but follow different rules. XML is much stricter than HTML: among other things, every tag must be closed and every attribute value must be enclosed in single or double quotes. Therefore, unless you are parsing XML-compliant XHTML, XmlReader will not work for you.
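The strictness described above can be seen with the standard library alone. The following sketch feeds XmlReader the img tag from the question twice, once with quoted attribute values and once as-is. The class and method names are illustrative, and ConformanceLevel.Fragment is used only so that a bare fragment is accepted; as far as I know it does not relax the quoting rule.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlStrictnessDemo
{
    // Hypothetical helper: true if XmlReader reads the whole fragment without error.
    public static bool IsWellFormed(string markup)
    {
        // Fragment conformance permits input without a single document root;
        // attribute values must still be quoted.
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        try
        {
            using (var reader = XmlReader.Create(new StringReader(markup), settings))
            {
                while (reader.Read())
                {
                    // Consume every node; parsing errors surface as XmlException.
                }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // XHTML-style markup with quoted attribute values parses.
        Console.WriteLine(IsWellFormed("<img src=\"/static/logo.png\" width=\"75\" height=\"90\" />"));

        // The markup from the question, with unquoted values, is rejected.
        Console.WriteLine(IsWellFormed("<img src=\"/static/logo.png\" width=75 height=90 />"));
    }
}
```

The first call prints True and the second prints False, which matches the exception reported in the question.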
