I need to extract all links from a html document having text as the

Asked: May 26, 2026 (2026-05-26T15:54:49+00:00)


I need to extract all links from an HTML document that have text as the inner element rather than a reference to an image. Basically, I would like to do a doc.select("//a/attribute::href") for all elements in the tree where doc.select("//a/text()") returns anything. Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-05-26T15:54:50+00:00


You can write conditions in XPath as a predicate in square brackets. For example, //a[text()]/@href selects the href attributes of all link (a) elements that have at least one text node child. If you instead want to make sure there is no img child element in the link, you can use //a[not(img)]/@href. The two conditions can also be combined in one predicate, e.g. //a[text() and not(img)]/@href.
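To see which links each predicate keeps, here is a minimal Python sketch. It is an illustration under assumptions, not the asker's doc.select API: the sample markup and the helper names (has_text_child, hrefs_with_text, hrefs_without_img_child) are made up for this example, and because the XPath subset in Python's standard-library ElementTree does not support text() or not(), the two predicates are emulated in plain Python rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (an assumption for this sketch).
# Real-world HTML usually has to go through an HTML parser first.
SAMPLE = """<html><body>
<a href="/text">Plain text link</a>
<a href="/image"><img src="logo.png" /></a>
<a href="/mixed"><img src="icon.png" /> Icon plus text</a>
<a href="/nested"><b>Bold only</b></a>
</body></html>"""


def has_text_child(a):
    """Emulate //a[text()]: the element has at least one text node child."""
    # ElementTree stores text node children as a.text (text before the first
    # child element) and as the .tail of each child (text after that child).
    if a.text:
        return True
    return any(child.tail for child in a)


def hrefs_with_text(root):
    """Emulate //a[text()]/@href."""
    return [a.get("href") for a in root.iter("a")
            if a.get("href") is not None and has_text_child(a)]


def hrefs_without_img_child(root):
    """Emulate //a[not(img)]/@href: no img element as a direct child."""
    return [a.get("href") for a in root.iter("a")
            if a.get("href") is not None and a.find("img") is None]


root = ET.fromstring(SAMPLE)
print(hrefs_with_text(root))          # ['/text', '/mixed']
print(hrefs_without_img_child(root))  # ['/text', '/nested']
```

The two results differ on purpose: the /mixed link has both an image and text, so only the text() predicate keeps it, while the /nested link has its text inside a b element rather than as a direct text node child, so only the not(img) predicate keeps it.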
