I have a two-part problem that needs fixing. I’ll try my best to describe it, then break down what I “think” the steps are.
I am trying to get a specific table in a webpage and email it to myself.
At the moment I am using GnuWin32 wget.exe (I’d rather use PowerShell natively, but for some reason I couldn’t; perhaps the method I was using couldn’t render the ASPX page?).
Using wget I was able to make a local html version of the ASPX page.
Now I have been attempting to parse the file and extract a specific table. In this particular case the table begins with <table border="0" cellpadding="2" cellspacing="2" width="300px"> and ends with </table> and there are no nested tables.
I’ve thrown some regex at my problem (yes I know regex may not be the tool I need here) but to no avail.
Amended
Here is where I am at now…
$content = (new-object System.Net.WebClient).DownloadString($url)
$found = $content -cmatch '(?si)<table border="0" cellpadding="2" cellspacing="2" width="300px"[^>]*>(.*?)Total Queries</td>(.*?)</tr>(.*?)</table>'
$result = $matches[3]
$result
I’ve done this sort of thing with PowerShell. It is pretty straightforward:
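A minimal sketch of that approach, assuming the page is already in $content (for example via DownloadString as in the question). The sample HTML and the border="0" attribute in the pattern are illustrative stand-ins, not taken from the real page:

```powershell
# Hypothetical sample page; in practice $content would be the downloaded HTML.
$content = @'
<html><body>
<table border="0" cellpadding="2">
<tr><td>Total Queries</td><td>42</td></tr>
</table>
</body></html>
'@

# (?s) lets . match newlines, (?i) ignores case, [^>]* skips any remaining
# attributes, and (.*?) lazily captures everything up to the first </table>.
$table = if ($content -match '(?si)<table border="0"[^>]*>(.*?)</table>') { $matches[1] }
$table
```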
Just substitute width for border and 300px for 0 in the regex for your table.
In the case of matching multiple tables, you have to switch from -match, which is a Boolean operator that only looks for a single match, to Select-String, which can find all matches when given -AllMatches.
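Something along these lines should work for the multiple-table case. This is only a sketch: the two-table sample page is made up, and the pattern deliberately matches any table rather than one specific attribute:

```powershell
# Hypothetical sample page containing two tables.
$content = @'
<html><body>
<table border="1"><tr><td>Header</td></tr></table>
<table border="0" cellpadding="2" cellspacing="2" width="300px">
<tr><td>Total Queries</td><td>42</td></tr>
</table>
</body></html>
'@

# Piping a single string means the regex runs over the whole page at once.
# -AllMatches makes Select-String record every match, not just the first.
$tables = $content |
    Select-String -Pattern '(?si)<table[^>]*>(.*?)</table>' -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Groups[1].Value }

# One inner-HTML string per table found.
$tables
```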
Essentially, all matches will be in the $_.Matches collection. If you know that the table is always the third one, you can access it by its zero-based index, i.e. $_.Matches[2].
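As a rough illustration of picking out the third table, assuming a made-up page with three tables where the last one is the one you want:

```powershell
# Hypothetical page with three tables; the third is the target.
$content = @'
<table><tr><td>first</td></tr></table>
<table><tr><td>second</td></tr></table>
<table border="0" cellpadding="2" cellspacing="2" width="300px"><tr><td>Total Queries</td><td>42</td></tr></table>
'@

# $_ is the MatchInfo object emitted by Select-String. With -AllMatches its
# Matches property holds every match; index 2 is the third table.
$third = $content |
    Select-String -Pattern '(?si)<table[^>]*>(.*?)</table>' -AllMatches |
    ForEach-Object { $_.Matches[2].Groups[1].Value }
$third
```

Indexing by position is fragile if the page layout changes, so matching on a distinguishing attribute such as width="300px" is probably the safer choice when one is available.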