I am trying to develop a program that will parse a website for data and store that data in variables that I can then use in functions inside the program.
Specifically, I’m trying to parse this page (click the Debuffs tab):
http://worldoflogs.com/reports/rt-1smdoscr7neq0k6b/spell/94075/
The source is pretty simple and looks like this:
<td><a href='/reports/rt-1smdoscr7neq0k6b/details/62/' class='actor'><span class='Warrior'>Zonnza</span></a></td>
<td>100</td>
</tr>
<tr>
<td><a href='/reports/rt-1smdoscr7neq0k6b/details/3/' class='actor'><span class='DeathKnight'>Fillzholez</span></a></td>
<td>89</td>
</tr>
I only want the names and numbers, i.e. what is between the <span class=''></span> tags and between the <td></td> tags. Is there any way to do what I’m looking for?
Any help would be greatly appreciated.
I’d look into Tag Soup. It’s an HTML parser that can cope with all the horrible HTML that’s out there. There’s a C++ port of it available too (I haven’t used that, so I can’t comment on how stable it is).
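If the markup really is as regular as the snippet in the question, a plain regular expression may be enough for a first attempt. The following is only a minimal sketch using std::regex from the C++ standard library, not Tag Soup; it assumes the page source has already been downloaded into a string, and the name extractRows is made up for illustration. It also assumes each name cell is immediately followed by its number cell, exactly as in the sample. A real HTML parser is the safer choice if the page layout varies.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of any library): returns (name, number)
// pairs found in HTML shaped like the sample rows in the question.
std::vector<std::pair<std::string, int>> extractRows(const std::string& html) {
    // Capture 1: text inside <span class='...'>...</span>
    // Capture 2: digits inside the <td>...</td> that follows the name cell
    static const std::regex rowPattern(
        R"(<span class='[^']*'>([^<]+)</span></a></td>\s*<td>(\d+)</td>)");

    std::vector<std::pair<std::string, int>> rows;
    for (std::sregex_iterator it(html.begin(), html.end(), rowPattern), end;
         it != end; ++it) {
        // (*it)[1] is the name, (*it)[2] is the number as text
        rows.emplace_back((*it)[1].str(), std::stoi((*it)[2].str()));
    }
    return rows;
}
```

Calling extractRows on the page source gives a vector you can loop over, with .first holding the name and .second holding the number. Fetching the page itself is a separate step that this sketch does not cover.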