I need to sort an HTML table with the following structure in Python.
<table>
<tr>
<td><a href="#">ABCD</a></td>
<td>A23BND</td>
<td><a title="ABCD">345345</td>
</tr>
<tr>
<td><a href="#">EFG</a></td>
<td>Add4D</td>
<td><a title="EFG">3432</td>
</tr>
<tr>
<td><a href="#">HG</a></td>
<td>GJJ778</td>
<td><a title="HG">2341333</td>
</tr>
</table>
I am doing something like this:
container = tree.findall("tr")
strOut = ""
data = []
for elem in container:
    key = elem.findtext(colName)
    data.append((key, elem))
data.sort()
The problem is that it sorts by the text inside the <td>. I want to be able to sort by the anchor value and not href.
What can I do to achieve that? Thanks a lot.
It sorts by the text because that's what you're extracting as the key when you do
key = elem.findtext(colName)
I imagine colName is some tag string, and findtext will just find the text of the first subelement matching that tag. If what you want instead is to use as the key the value of some attribute (e.g. title?) of an <a>, you would find that <a> element and read the attribute from it with get, rather than calling findtext. Exactly what do you want to use as the key?
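In case it helps while you decide on the key, here is a minimal sketch of sorting the rows by the text of the anchor in a given column. It assumes the table can be parsed by xml.etree.ElementTree, so the copy of the table below closes the </a> tags that are missing in the third column of the question. The names TABLE_XML, anchor_text and sorted_rows are made up for this example and are not part of any library.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the question's table; the closing </a> tags in the
# third column are an assumption added here so that ElementTree can parse it.
TABLE_XML = """
<table>
<tr><td><a href="#">ABCD</a></td><td>A23BND</td><td><a title="ABCD">345345</a></td></tr>
<tr><td><a href="#">EFG</a></td><td>Add4D</td><td><a title="EFG">3432</a></td></tr>
<tr><td><a href="#">HG</a></td><td>GJJ778</td><td><a title="HG">2341333</a></td></tr>
</table>
""".strip()


def anchor_text(row, col_index):
    """Return the text of the first <a> in the given column of a <tr>.

    Falls back to the cell's own text when the cell has no anchor.
    """
    cell = row.findall("td")[col_index]
    link = cell.find("a")
    text = link.text if link is not None else cell.text
    return (text or "").strip()


def sorted_rows(table, col_index, convert=str):
    """Return the table's <tr> elements sorted by the chosen column.

    Passing key= means only the extracted values are compared, never the
    elements themselves. Use convert=int for numeric columns.
    """
    return sorted(
        table.findall("tr"),
        key=lambda row: convert(anchor_text(row, col_index)),
    )


tree = ET.fromstring(TABLE_XML)

# Sort by the number inside the anchor of the third column.
by_number = sorted_rows(tree, 2, convert=int)
order = [anchor_text(row, 0) for row in by_number]
print(order)
```

If the key should be the title attribute instead of the anchor text, the same idea applies with row.find("td/a[@title]").get("title") as the key function.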