I’m having a problem where xmlValue strips the <br /> tags that I need kept (or transformed into some other character that I can then strsplit on).
Here’s an example:
> f <- htmlParse(getForm("http://sites.target.com/site/en/spot/store_locator_popups.jsp", ajax="true", storeNumber=1889), asText=TRUE)
> xpathSApply(f, "//div[@class=\"sl_results_popup_address\"]", xmlValue)
[1] "1154 S Clark StChicago, IL 60605(312) 212-6300"
Versus the HTML it’s parsing:
<div class="sl_results_popup_address">
1154 S Clark St
<br/>
Chicago, IL 60605
<br/>
(312) 212-6300
</div>
I’ve tried adding recursive=FALSE to the xmlValue call, but that doesn’t seem to help.
If they were <p> and </p> line breaks then it would be easier since I could just grab them individually, but with <br/> not wrapping the text I really can’t go that direction. Hoping there’s just an option to reduce the level of stripping done within xmlValue (or maybe the <br/>s are being stripped at the parsing-of-document phase?).
Two things may help.
So just replace the <br/> tags with something else, or use your original code and adapt it if you want to keep the tags.
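The code that went with this reply is not preserved, so what follows is only a rough sketch of what the suggestions might look like, not the original answer. It assumes the XML package and uses an inline copy of the HTML from the question instead of fetching the page. The variable names (html, xp, parts, marked, markup) and the "|" delimiter are made up for illustration.

```r
library(XML)

# Inline copy of the fragment from the question (assumption: the live page
# has the same structure).
html <- '<div class="sl_results_popup_address">
1154 S Clark St
<br/>
Chicago, IL 60605
<br/>
(312) 212-6300
</div>'

xp <- "//div[@class='sl_results_popup_address']"
doc <- htmlParse(html, asText = TRUE)

# Option 1: ask XPath for the div's text-node children, so each run of text
# between <br/> tags comes back as its own element.
parts <- xpathSApply(doc, paste0(xp, "/text()"), xmlValue)
parts <- trimws(parts)
parts <- parts[nzchar(parts)]

# Option 2: swap the <br/> tags for a delimiter before parsing, then split
# the xmlValue result on that delimiter.
marked <- gsub("<br\\s*/?>", "|", html, ignore.case = TRUE, perl = TRUE)
doc2 <- htmlParse(marked, asText = TRUE)
value <- xpathSApply(doc2, xp, xmlValue)
parts2 <- trimws(strsplit(value, "|", fixed = TRUE)[[1]])

# Option 3: keep the tags by serializing the node instead of taking its
# text value.
node <- getNodeSet(doc, xp)[[1]]
markup <- saveXML(node)
```

With the real page, doc would come from the htmlParse(getForm(...)) call in the question. Option 2 only works if the delimiter never occurs in the address text itself.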