team = hxs.select (‘//table[@class=”tablehead”/tbody/tr[contains[.@class, “player”]’)
The structure of the web site I whose table I want to select is as follows:
<html>
<body>
<table>
<tbody>
<tr>
<td>...</td>
<td>...</td>
...
</tr>
</tbody>
</table>
</body>
</html>
Since there are multiple tables in the web site, I only want to select the one whose class is defined as “tablehead”. Also, for that table, I only want to select the tags whose class attributes contain the string “player”. My attempt above looks a bit spotty to begin with. I tried running the crawler, and it says that the line I produced above is an invalid xpath line. Any advice would be nice.
Correcting this results in:
This selects every
trthe string value of whoseclassattribute contains the string"player"and that (thetr) is a child of atbodythat is a child of anytablein the XML document, whoseclassattribute has string value"tablehead".XSLT – based verification:
When this transformation is applied on the provided XML document (made just a little bit more realistic):
the Xpath expression is evaluated and the selected nodes (just one in this case) are copied to the output: