I’m working on a C# console application. The ultimate goal is to find a specific row in a table, then click a link to download a file generated by an old web app. (The app is old enough that there’s no API for me to use.)
The table is structured as follows:
<html>
<head>
<title>Test Table Page</title>
</head>
<body>
<table border="1" cellpadding="3" cellspacing="5">
<tr>
<td>Test Row One</td>
<td>Test Content</td>
</tr>
<tr>
<td>Test Row Two</td>
<td>Test Content</td>
</tr>
<tr>
<td>Test Row Three</td>
<td>Test Content</td>
</tr>
</table>
</body>
</html>
What I want to do is get the Test Content associated with Test Row Two. In other words, I need to find a cell by the name of a report in the adjacent cell.
If you expect the HTML to be well-formed XML, you could just use an XML parser like below (with XPath). Personally, I like to avoid HTML parsers because they are big and complicated; using one here is like using a chainsaw to snap a twig in half. Sometimes nothing else will do, but if there’s a simpler solution, try that first.
Relevant Code Snippet:
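A minimal sketch of the XPath lookup, assuming the page parses as well-formed XML. The sample table from the question is embedded as a string (with the closing `</html>` supplied), and the XPath expression is one reasonable way to express "the cell after the one labelled Test Row Two", not necessarily the only one.

```csharp
using System;
using System.Xml;

// The sample page from the question, closed with </html> so that it is well-formed XML.
const string html = @"<html>
<head><title>Test Table Page</title></head>
<body>
<table border=""1"" cellpadding=""3"" cellspacing=""5"">
<tr><td>Test Row One</td><td>Test Content</td></tr>
<tr><td>Test Row Two</td><td>Test Content</td></tr>
<tr><td>Test Row Three</td><td>Test Content</td></tr>
</table>
</body>
</html>";

var doc = new XmlDocument();
doc.LoadXml(html); // throws XmlException if the markup is not well-formed

// Find the cell whose text is the row label, then step to the cell right after it.
var cell = doc.SelectSingleNode(
    "//td[normalize-space(.)='Test Row Two']/following-sibling::td[1]");

Console.WriteLine(cell?.InnerText ?? "(row not found)");
```

If the real page declares a default namespace on its root element (as XHTML pages often do), the element names in the XPath need a prefix registered through an `XmlNamespaceManager`.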
Full Source Code:
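What follows is a sketch of a complete console program built around the same idea, not the original listing. `FindAdjacentCell` and `SampleHtml` are hypothetical names introduced for illustration, and the second-column values are made distinct (an assumption) so the result is unambiguous. The label is compared in C# rather than inside the XPath string, so labels containing quotes need no escaping.

```csharp
using System;
using System.Xml;

public static class Program
{
    // Sample page modelled on the question; cell contents altered for illustration.
    public const string SampleHtml = @"<html>
<head><title>Test Table Page</title></head>
<body>
<table border=""1"" cellpadding=""3"" cellspacing=""5"">
<tr><td>Test Row One</td><td>Content One</td></tr>
<tr><td>Test Row Two</td><td>Content Two</td></tr>
<tr><td>Test Row Three</td><td>Content Three</td></tr>
</table>
</body>
</html>";

    // Returns the trimmed text of the cell next to the one labelled rowLabel,
    // or null when no row carries that label. Hypothetical helper.
    public static string? FindAdjacentCell(string html, string rowLabel)
    {
        var doc = new XmlDocument();
        doc.LoadXml(html); // throws XmlException on markup that is not well-formed

        var rows = doc.SelectNodes("//tr");
        if (rows == null) return null;

        foreach (XmlNode row in rows)
        {
            var cells = row.SelectNodes("td");
            if (cells == null || cells.Count < 2) continue;

            var label = cells[0];
            var value = cells[1];
            if (label != null && value != null && label.InnerText.Trim() == rowLabel)
                return value.InnerText.Trim();
        }
        return null;
    }

    public static void Main()
    {
        var content = FindAdjacentCell(SampleHtml, "Test Row Two");
        Console.WriteLine(content ?? "(row not found)");
    }
}
```

In the real application the HTML string would come from the web app's response instead of a constant, and the matched cell would hold the download link rather than plain text.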