I’m implementing a web robot that has to get all the links from a

Asked: June 11, 2026

I’m implementing a web robot that has to get all the links from a page and select the needed ones. I have it all working, except for one problem: links nested inside a “table” or “span” tag are not picked up.
Here’s my code snippet:

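// timeout() takes milliseconds, so TIMEOUT is presumably in seconds.
Document doc = Jsoup.connect(url)
    .timeout(TIMEOUT * 1000)
    .get();
// Collect every <a> element in the parsed document.
Elements elts = doc.getElementsByTag("a");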

And here’s the example HTML:

<table>
  <tr><td><a href="www.example.com"></a></td></tr>
</table>

My code does not fetch such links, and using doc.select doesn’t help either. How can I get all the links from the page?

EDIT: I think I know where the problem is. The page I’m having trouble with is very badly written; an HTML validator reports a tremendous number of errors. Could this cause problems?
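
To narrow this down, here is a minimal sketch that runs the same kind of lookup on the example HTML parsed from a string, so the network fetch and the real page's markup are out of the picture. It assumes jsoup is on the classpath. The class name LinkCheck, the helper extractHrefs, and the base URI http://example.org/ are placeholders made up for this check, not part of my robot.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LinkCheck {

    // Hypothetical helper: returns the raw href value of every <a> element
    // that carries an href attribute, wherever it sits in the document.
    static List<String> extractHrefs(String html, String baseUri) {
        // Parse from a string instead of fetching, to isolate the selector.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> hrefs = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            hrefs.add(a.attr("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // The example HTML from the question, inlined as a string.
        String html = "<table>\n"
                + "  <tr><td><a href=\"www.example.com\"></a></td></tr>\n"
                + "</table>";
        // The base URI is a placeholder; it only matters for resolving
        // relative links, which this sketch does not do.
        for (String href : extractHrefs(html, "http://example.org/")) {
            System.out.println(href);
        }
    }
}
```

If this prints the link but the robot still misses it on the live page, the selector itself is probably not at fault. In that case it may help to dump doc.outerHtml() for the fetched document and check whether the anchor is present in what jsoup actually parsed.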