I’m parsing an HTML file into a well-formed XML document using NekoHTML parser. However

Asked: May 30, 2026

I’m parsing an HTML file into a well-formed XML document using the NekoHTML parser. However, I can’t quite figure out the GPath expression that identifies the table containing the “Settings” string.

```
def parser = new org.cyberneko.html.parsers.SAXParser()
parser.setFeature('http://xml.org/sax/features/namespaces', false)

def html =
'''
    <html>
        <title>Hiya!</title>
    </html>
    <body>
        <table>
            <tr>
                <th colspan='3'>Settings</th>
                <td>First cell r1</td>
                <td>Second cell r1</td>
            </tr>
        </table>
        <table>
            <tr>
                <th colspan='3'>Other Settings</th>
                <td>First cell r2</td>
                <td>Second cell r2</td>
            </tr>
        </table>
'''

def slurper = new XmlSlurper(parser)
def page = slurper.parseText(html)
```

In this sample, the first table should be selected so that I can iterate over other row values in it. Can someone help me with this GPath please?

EDIT (side question): why does

```
println page.HTML.HEAD.TITLE
```

print an empty string? Shouldn’t it return the title?


1 Answer

Editorial Team · Answer 1 · 2026-05-30T00:26:31+00:00

To get the table with ‘Settings’ in the header, you should be able to do:

```
def settingsTableNode = page.BODY.TABLE.find { table ->
  table.TBODY.TR.TH.text() == 'Settings'
}
```
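
Once the table node is found, iterating over the other cells in its row might look like the sketch below. To keep it runnable without NekoHTML on the classpath, it feeds a hand-written, well-formed stand-in document to a plain XmlSlurper. The upper-case element names and the TBODY wrapper are assumptions that mirror the path used above, and the `cells` variable is only illustrative. Note that `TH.text()` concatenates the text of every matching TH, so the comparison assumes one header cell per table.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x the class lives in groovy.util

// Assumed shape of the parsed document: upper-case names, TBODY inside each TABLE.
def xml = '''
<HTML>
  <HEAD><TITLE>Hiya!</TITLE></HEAD>
  <BODY>
    <TABLE><TBODY><TR>
      <TH colspan="3">Settings</TH><TD>First cell r1</TD><TD>Second cell r1</TD>
    </TR></TBODY></TABLE>
    <TABLE><TBODY><TR>
      <TH colspan="3">Other Settings</TH><TD>First cell r2</TD><TD>Second cell r2</TD>
    </TR></TBODY></TABLE>
  </BODY>
</HTML>
'''

def page = new XmlSlurper().parseText(xml)

// Same lookup as in the answer: first TABLE whose header text is exactly 'Settings'.
def settingsTableNode = page.BODY.TABLE.find { table ->
    table.TBODY.TR.TH.text() == 'Settings'
}

// Collect the text of every TD in that table's rows, then print each value.
def cells = settingsTableNode.TBODY.TR.TD.collect { it.text() }
cells.each { println it }
```

The exact-match comparison is what keeps the “Other Settings” table from being selected.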

As for the side question: page already points to the root element of the document (the HTML element itself), so you don’t need the HTML step in the path. All you should need to do is:
```
println page.HEAD.TITLE
```
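
A small sketch of why the extra step yields an empty string, again using a hand-written stand-in document with a plain XmlSlurper rather than NekoHTML (the upper-case names are an assumption about the parser's output, and `viaRoot` and `viaHtml` are illustrative names):

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x the class lives in groovy.util

def page = new XmlSlurper().parseText(
    '<HTML><HEAD><TITLE>Hiya!</TITLE></HEAD><BODY/></HTML>')

// The object returned by parseText is the root element itself.
def rootName = page.name()

// Navigating from the root finds the title.
def viaRoot = page.HEAD.TITLE.text()

// page.HTML looks for a child of the root named HTML; there is none,
// so the result is an empty node set whose text is the empty string.
def viaHtml = page.HTML.HEAD.TITLE.text()

println "root=${rootName} viaRoot='${viaRoot}' viaHtml='${viaHtml}'"
```

GPath navigation does not throw on a missing element, which is why the mistake shows up as an empty string rather than an error.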
