I’m screen-scraping an HTML page which contains: <table border=1 class=searchresult cellpadding=2> <tr><th colspan=2>Last search</th></tr>

Asked: May 24, 2026

I’m screen-scraping an HTML page which contains:

<table border=1 class="searchresult" cellpadding=2> 
<tr><th colspan=2>Last search</th></tr> 
<tr><th align=left>Search term</th><td>xxxxxx</td></tr> 
<tr><th align=left>Result</th><td>yyyyyyyy</td></tr> 
</table>

I want to write an XPath expression that selects the data cell containing “yyyyyyyy”. So far I have

.//table[@class='searchresult']//tr/th

which gets me a list of all the table-header nodes in the table. I can iterate over them in user code, find the one whose .text is “Result”, and then call .getnext() on it to get the table-data cell. But is there a cleaner way to do this with a more specific XPath pattern? It seems like there should be, but I haven’t gotten my head far enough around XPath yet to figure out how.

If it matters, I’m doing this in Python with lxml.
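
For concreteness, the iterate-and-getnext() approach described above looks roughly like the sketch below. The `PAGE` string, the `find_result_cell` helper, and its `label` parameter are made up for illustration; the real page is fetched elsewhere, and the sample markup is wrapped in `<html><body>` only so that the relative `.//table` search has a parent element to start from.

```python
from lxml import html

# Sample markup mirroring the table in the question (illustration only).
PAGE = """
<html><body>
<table border=1 class="searchresult" cellpadding=2>
<tr><th colspan=2>Last search</th></tr>
<tr><th align=left>Search term</th><td>xxxxxx</td></tr>
<tr><th align=left>Result</th><td>yyyyyyyy</td></tr>
</table>
</body></html>
"""


def find_result_cell(root, label="Result"):
    """Return the element right after the <th> whose text equals label, or None."""
    # Select every header cell in the search-result table, then filter in Python.
    for th in root.xpath(".//table[@class='searchresult']//tr/th"):
        if th.text == label:
            # getnext() returns the next sibling element, here the <td>.
            return th.getnext()
    return None


root = html.fromstring(PAGE)
cell = find_result_cell(root)
result_text = cell.text if cell is not None else None
print(result_text)
```

This works, but the filtering on the header text happens in Python rather than in the XPath expression itself, which is the part I would like to push into the pattern.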