What XPath query could I use to solve the problem below? I’m actually using Nokogiri (in Ruby), so ideally the answer would be in Ruby/Nokogiri form, but otherwise plain XPath is fine and I can adapt it.
Required Output
I’m seeking to parse the below HTML (a full html page, but I’ve just copy/pasted the relevant part for clarity), and end up with basically the following:
Phone Number    Plan ID
545454545       12345
3434343434      67890
So in the context of Ruby/nokogiri this could be in a Hash for example:
% result = { "545454545" => "12345", "3434343434" => "67890" }
HTML to be Parsed
.
.
.
<form method="post">
<div style='line-height:18px;background-color:#FFFFFF;border: 1px #dedede solid;padding:10px;'>
<table width='90%' border=0>
<tr>
<td width='30%'> Plan ID </td>
<td width='70%'> 12345 </td>
</tr>
<tr>
<td> Phone Number </td>
<td> 545454545 </td>
</tr>
.
.
.
</table>
</div>
<br>
.
.
.
<div style='line-height:18px;background-color:#FFFFFF;border: 1px #dedede solid;padding:10px;'>
<table width='90%' border=0>
<tr>
<td width='30%'> Plan ID </td>
<td width='70%'> 67890 </td>
</tr>
<tr>
<td> Phone Number </td>
<td> 3434343434 </td>
</tr>
.
.
.
</table>
</div>
<br>
Assuming the lines you’ve replaced with periods do not contain data you want to collect, which would mean each table provides a unique result set, the following would work: