I have this xPath expression that I’m putting into htmlCleaner: //table[@class='StandardTable']/tbody/tr[position()>1]/td[2]/a/img

Question

Asked: May 27, 2026

I have this xPath expression that I’m putting into htmlCleaner:

 //table[@class='StandardTable']/tbody/tr[position()>1]/td[2]/a/img

Now, my issue is that it changes, and sometimes the /a/img element is not present. So I would like an expression that gets all elements

//table[@class='StandardTable']/tbody/tr[position()>1]/td[2]/a/img

when /a/img is present, and

//table[@class='StandardTable']/tbody/tr[position()>1]/td[2]

when /a/img is not present.

Does anyone have any idea how to do this? I found something in another question that looks like it might help me

descendant-or-self::*[self::body or self::span/parent::body]

but I don’t understand it.

1 Answer

Editorial Team · Answer 1 · 2026-05-27T16:42:08+00:00

You can select the union of two mutually exclusive expressions (notice the | union operator):

//table[@class='StandardTable']/tbody/tr[position()>1]/td[2]/a/img|
//table[@class='StandardTable']/tbody/tr[position()>1]/td[2][not(a/img)]

For any given row, the two expressions are mutually exclusive: the first matches only when td[2] contains a/img, and the second matches only when it does not. That means you’ll always get just the required node for each row.
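To see the union in action, here is a minimal sketch that runs it through plain JAXP against a small, well-formed sample table. The class name UnionXPathDemo, the SAMPLE markup, and the selectNames helper are made up for illustration; they are not from the original question, and HtmlCleaner is not involved here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UnionXPathDemo {
    // Hypothetical sample: a header row, a row with a/img, and a row without it.
    static final String SAMPLE =
        "<html><body><table class='StandardTable'><tbody>"
        + "<tr><td>h1</td><td>h2</td></tr>"
        + "<tr><td>a</td><td><a href='#'><img src='x.png'/></a></td></tr>"
        + "<tr><td>b</td><td>plain text</td></tr>"
        + "</tbody></table></body></html>";

    // Common prefix shared by both branches of the union.
    static final String CELL =
        "//table[@class='StandardTable']/tbody/tr[position()>1]/td[2]";

    // Returns the element names of the selected nodes.
    static List<String> selectNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // Branch 1: the img when present. Branch 2: the td when a/img is absent.
            String expr = CELL + "/a/img | " + CELL + "[not(a/img)]";
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectNames(SAMPLE));
    }
}
```

With this sample, the second row contributes its img and the third row contributes its td, while the header row is skipped by position()>1.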

From your comments on @Dimitre’s answer, I see that HTMLCleaner doesn’t fully support XPath 1.0. You don’t really need it to. You just need HTMLCleaner to parse input that isn’t well-formed. Once it has done that job, convert its output into a standard org.w3c.dom.Document and treat it as XML.

Here’s a conversion example:

TagNode tagNode = new HtmlCleaner().clean("<html><div><p>test");
Document doc = new DomSerializer(new CleanerProperties()).createDOM(tagNode);

From here on out, just use JAXP with whatever implementation you want:

XPath xpath = XPathFactory.newInstance().newXPath();
Node node = (Node) xpath.evaluate("/html/body/div/p[not(child::*)]", 
                       doc, XPathConstants.NODE);
System.out.println(node.getTextContent());

Output:

test
