So I have this code inside of a class file:
Document requestData(String url, [String postVars, bool pauseApp = false, onSuccess(Document ht)]) {
  HttpRequest html = new HttpRequest();
  html.open((postVars == null ? 'GET' : 'POST'), url, async: !pauseApp);
  html.send(postVars);
  if (pauseApp == true) {
    // Synchronous request: the response is available immediately.
    return html.responseXML;
  } else {
    html.on.readyStateChange.add((Event e) {
      if (html.readyState == HttpRequest.DONE && (html.status == 200 || html.status == 0)) {
        try {
          // HERE IS WHERE THE ISSUE IS ----V
          DOMParser d = new DOMParser();
          onSuccess(d.parseFromString(html.responseText, "text/html"));
        } catch (e) {
          print("Error on requestData($url) async = $pauseApp - $e");
        }
      }
    });
  }
}
(Entire source for reference: http://pastebin.com/z21PM7r0. I am using the Dartium flag --disable-web-security to allow cross-origin requests.)
The issue is that the request's responseXML is null, whereas responseText contains the HTML as expected. To work around this I tried parsing responseText with DOMParser, and that failed as well.
As I don't own or control the server I need to connect to, I cannot fix the HTML myself. I assume the problem is that the HTML is malformed.
Here is the code of the website I am trying to parse using the function above:
http://pastebin.com/KvMN9AuF
The W3C validator reports 193 errors and 16 warning(s) for that page.
Does anybody know how to work around this issue? Or is this something I am just going to have to give up on?
Try html5lib. It's a spec-compliant HTML5 parser written in pure Dart. You should be able to parse the malformed HTML with it, and then use document.outerHtml to get a well-formed String.
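A minimal sketch of that suggestion, under a few assumptions: html5lib has since been republished on pub as the `html` package, so the import below uses `package:html/parser.dart` (with the original package it would be `package:html5lib/parser.dart`). The `malformed` string is a made-up stand-in for the responseText you get back from the server.

```dart
// Assumption: the `html` package (the successor to html5lib) is listed
// under dependencies in pubspec.yaml.
import 'package:html/parser.dart' show parse;

void main() {
  // Hypothetical stand-in for html.responseText: unclosed <p> and <b>,
  // and no <html>, <head> or <body> wrappers.
  const malformed = '<p>first<p>second <b>bold';

  // parse() applies the HTML5 error-recovery rules, so it builds a
  // document tree from bad markup instead of returning null or throwing.
  final document = parse(malformed);

  // Serializing the repaired tree gives a well-formed String.
  print(document.outerHtml);

  // The tree can also be queried directly, without re-serializing.
  for (final p in document.querySelectorAll('p')) {
    print(p.text);
  }
}
```

In the function from the question, this would probably mean passing the result of parse(html.responseText) to onSuccess instead of going through DOMParser. Note that the Document type here is the package's own class, not the dart:html one, so the callback signature would need to refer to it.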