I have a string containing the HTML of a page, as received from a GET request:
Dim http
Set http = CreateObject("MSXML2.XMLHTTP")
http.open "GET", "http://www.example.com", False
http.send
' http.responseText now holds the HTML as a string
How can I convert this string to a Document object? I would like a more natural way of parsing the HTML than manually searching through it.
If it is valid XHTML (that is, well-formed XML), then you can load it into a DOMDocument with loadXML(). An example: http://msdn.microsoft.com/en-us/library/ms756007(v=vs.85).aspx
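As a rough sketch of the loadXML() approach, the snippet below parses a small XHTML string and lists its links. The sample markup is made up and stands in for http.responseText; WScript.Echo assumes the script runs under Windows Script Host (in an ASP page you would use Response.Write instead).

```vbscript
' Sketch only: parse an XHTML string with MSXML's DOMDocument.
' The xhtml value is a made-up sample standing in for http.responseText.
Dim xhtml, doc, links, i
xhtml = "<html xmlns=""http://www.w3.org/1999/xhtml""><body>" & _
        "<a href=""http://www.example.com/one"">One</a>" & _
        "<a href=""http://www.example.com/two"">Two</a>" & _
        "</body></html>"

Set doc = CreateObject("MSXML2.DOMDocument")
doc.async = False   ' parse synchronously so the result is ready immediately

' loadXML returns True only if the string is well-formed XML
If doc.loadXML(xhtml) Then
    Set links = doc.getElementsByTagName("a")
    For i = 0 To links.length - 1
        WScript.Echo links.item(i).getAttribute("href") & " -> " & links.item(i).text
    Next
Else
    ' parseError describes why the parser rejected the input
    WScript.Echo "Not well-formed XML: " & doc.parseError.reason
End If
```

Most real-world HTML is not well-formed XML (unclosed tags, unquoted attributes), in which case loadXML() returns False and parseError.reason explains why.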
Otherwise, you could use a browser COM object (as previously answered here: How do you extract data from vendor website in vbscript?). Note, however, that this is not something you’d want to do server-side in ASP pages, as it’s likely to leak resources and cause instability.
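One such object is the MSHTML document exposed under the ProgID "htmlfile"; this is an assumption on my part and not necessarily what the linked answer uses. A minimal sketch, again with a made-up sample string standing in for http.responseText and assuming Windows Script Host for the output:

```vbscript
' Sketch only: let the MSHTML engine parse (possibly malformed) HTML.
' The html value is a made-up sample standing in for http.responseText.
Dim html, doc, links, i
html = "<html><body><p>Unclosed paragraph" & _
       "<a href=""http://www.example.com/"">Example</a></body></html>"

Set doc = CreateObject("htmlfile")
doc.write html   ' feed the markup to the parser
doc.close        ' signal the end of the document

Set links = doc.getElementsByTagName("a")
For i = 0 To links.length - 1
    WScript.Echo links.item(i).href & " -> " & links.item(i).innerText
Next
```

Unlike loadXML(), this tolerates markup that is not well-formed, but the caveat above about using browser components server-side still applies.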
Finally, you could use a dedicated third-party HTML-parsing COM object, e.g. http://www.miken.com/htmlzap/ (I am not recommending this one; it is just the result of a very quick Google search, though it might be great for all I know).