This is not the first time I’ve run into a problem while using htmlParse from the XML package, but in the past I’ve just given up and used a regex to extract what I needed instead. I’d rather do it by parsing the XML/XHTML, since, as we all know, regexes aren’t parsers.
That said, I find the error messages from the parse functions unhelpful at best, and I have no idea how to proceed. For instance:
> htmlParse(getForm("http://www.takecarehealth.com/LocationSearchResults.aspx", location_query="Deer Park",location_distance=50))
Error in htmlParse(getForm("http://www.takecarehealth.com/LocationSearchResults.aspx", :
File
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head id="ctl00_Head1">
<title></title>
<script language="JavaScript" type="text/javascript">
var s_pageName = document.title;
var s_channel = "Take Care";
var s_campaign = "";
var s_eVar1 = ""
var s_eVar2 = ""
var s_eVar22 = ""
var s_eVar23 = ""
</script>
<meta name="keywords" content="take care clinic, walgreens clinic, walgreens take care clinic, take care health, urgent care clinic, walk in clinic" />
<meta name="description" content="Information about simple, quality healthcare for the whole family from Take Care Clinics at select Walgreens, including Take Care Clinic hours, providers, offers, insurance and quality of care." />
<link rel="shortcut icon" hre
I’m glad it sees something in there, but where do I drill down past “Error: File”?
Note that this is, as far as I can tell, well-formed XHTML. When I visit the link manually I can run XPath queries on it, and Firebug does not complain.
How do I debug errors from htmlParse like this?
Downloading the page first and then passing it to the XML package seems to work.
Parsing the URL directly also seems fine.
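The error text in the question begins with "File" followed by the page source, which suggests that htmlParse treated the downloaded string as a file name rather than as markup. A minimal sketch of parsing already-downloaded text is below. It assumes the page content is held in a character string; `page_text` is a made-up stand-in for what getForm or getURL would return, not the real page. The `asText = TRUE` argument tells htmlParse that its input is the document itself.

```r
library(XML)

# Hypothetical stand-in for the text returned by getForm() or getURL();
# the real page content is not reproduced here.
page_text <- paste(
  "<!DOCTYPE html>",
  "<html><head><title>Take Care</title></head>",
  "<body><p class=\"clinic\">Deer Park</p></body></html>",
  sep = "\n"
)

# asText = TRUE: treat the argument as markup, not as a file name or URL.
doc <- htmlParse(page_text, asText = TRUE)

# Once the document parses, XPath queries work as usual.
title_text  <- xpathSApply(doc, "//title", xmlValue)
clinic_text <- xpathSApply(doc, "//p[@class='clinic']", xmlValue)
```

If this works on the real response, the original call could probably be fixed by wrapping it the same way, for example `htmlParse(getForm(...), asText = TRUE)`, though that has not been checked against the site in question.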