I am scraping a website, and I need to get the numerical values from this HTML document:
<td>
<span style=" color: red; font-weight: bold;"> 1.950</span>
</td>
<td> 3.400</td>
I need to extract both 1.950 and 3.400, but I can’t figure out how to do it when one value sits directly in a <td> while the other is wrapped in a span inside it. Is there a general way to get the text of both the parent and the child in one path? I am using the Scrapy framework with the HtmlXPathSelector. I can use the path /td/text() for one and /td/span/text() for the other, but I need to do it in one query. How can this be achieved?
You can try:
/td//text() to select every text node that is a descendant of a td
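One thing to watch for: the descendant text nodes of the first td include the newlines around the span, so the raw result contains whitespace-only strings that need to be stripped and filtered out. Here is a rough sketch of that clean-up. It deliberately does not use Scrapy, so that it runs with the standard library alone; it assumes the snippet is wrapped in a made-up <tr> root and uses ElementTree's itertext(), which walks the same descendant text nodes that //text() selects. The name cell_values is just for illustration.

```python
import xml.etree.ElementTree as ET

# The snippet from the question, wrapped in a hypothetical <tr> root
# so that it parses as a single well-formed element.
HTML = """<tr>
<td>
<span style=" color: red; font-weight: bold;"> 1.950</span>
</td>
<td> 3.400</td>
</tr>"""


def cell_values(markup):
    """Return the non-blank text found anywhere inside each <td>."""
    root = ET.fromstring(markup)
    values = []
    for td in root.iter("td"):
        # itertext() yields the cell's own text and the text of all of its
        # descendants, which is what td//text() selects in XPath.
        for text in td.itertext():
            text = text.strip()
            if text:  # skip the whitespace-only nodes around the span
                values.append(text)
    return values


values = cell_values(HTML)
print(values)
```

With Scrapy itself, the same strip-and-filter step would be applied to whatever list of strings the selector extracts for the td//text() query.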