Question

Editorial Team

Asked: May 10, 2026 (2026-05-10T18:14:17+00:00)

I want to parse a web page in Groovy and extract every href link along with its associated link text.

If the page contained these links:

<a href='http://www.google.com'>Google</a><br /> <a href='http://www.apple.com'>Apple</a>

the output would be:

Google, http://www.google.com
Apple, http://www.apple.com

I’m looking for a Groovy answer, a.k.a. the easy way!


1 Answer

score 0 · Answer 1 · 2026-05-10T18:14:17+00:00

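Assuming the page is well-formed XHTML, slurp it with XmlSlurper, walk every node depth-first, keep the ‘a’ elements, and print each one's text and href. Note that the println string must be double-quoted: Groovy only interpolates ${...} expressions inside double-quoted strings.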

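import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this line (groovy.util.XmlSlurper is imported by default)

def input = '''<html><body>
<a href='http://www.hjsoft.com/'>John</a>
<a href='http://www.google.com/'>Google</a>
<a href='http://www.stackoverflow.com/'>StackOverflow</a>
</body></html>'''

// Parse the well-formed XHTML into a navigable tree
def doc = new XmlSlurper().parseText(input)

// Visit every node depth-first, keep only the <a> elements, print "text, href"
doc.depthFirst().collect { it }.findAll { it.name() == 'a' }.each {
    // Double quotes are required here so that ${...} is interpolated
    println "${it.text()}, ${it.@href.text()}"
}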
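The same traversal can be wrapped in a small helper that returns the lines instead of printing them, which makes the result easier to reuse or check. This is only a minimal sketch under the answer's assumption of well-formed XHTML; the name `extractLinks` is hypothetical and not part of Groovy's API, and the sample page is the one from the question.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this line (groovy.util.XmlSlurper is a default import)

// Hypothetical helper: returns one "text, href" string per <a> element.
List<String> extractLinks(String xhtml) {
    def doc = new XmlSlurper().parseText(xhtml)
    // depthFirst() visits every node; collect { it } copies the results into a list
    def anchors = doc.depthFirst().collect { it }.findAll { it.name() == 'a' }
    // Double-quoted strings interpolate ${...}; toString() turns each GString into a plain String
    return anchors.collect { "${it.text()}, ${it.@href.text()}".toString() }
}

// Sample input taken from the question
def page = '''<html><body>
<a href='http://www.google.com'>Google</a><br />
<a href='http://www.apple.com'>Apple</a>
</body></html>'''

extractLinks(page).each { println it }
```

For the sample page this prints the two lines asked for in the question: "Google, http://www.google.com" and "Apple, http://www.apple.com". Real-world HTML that is not well-formed would make `parseText` throw, so it would need to be cleaned up or parsed with a more lenient parser first.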
