I have an xml file like this: <data> <entry> <word>ABC</word> (this) </entry> <entry> <word>ABC</word>

Question

0

Asked: June 18, 20262026-06-18T05:58:53+00:00 2026-06-18T05:58:53+00:00

I have an xml file like this: <data> <entry> <word>ABC</word> (this) </entry> <entry> <word>ABC</word>

0

I have an xml file like this:

<data>
      <entry>
           <word>ABC</word> (this)
      </entry>
      <entry>
           <word>ABC</word> [not this]
      </entry>
</data>

I want to select nodes whose descendant include “(“, and move (.*) to the text of <entry>. That is:

<data>
      <entry>
           (this)
           <word>ABC</word>
      </entry>
      <entry>
           <word>ABC</word> [not this]
      </entry>
</data>

I’m using lxml. And I tried:

 import lxml.etree as ET
 data = ET.parse('sample.xml')
 for entry in data.iter('entry'):
      A = entry.xpath('.//*[text() = ".*(.*?)"]')

But it doesn’t work. “(” can appear as a tail of a node or as a text of a node.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-18T05:58:55+00:00

There are a few problems here:

First, you’re trying to use xpath to do regex matching, but you’re using =. Your regex is also improperly formatted. To actually do regex matching in xpath, you’d need to do something like:

import lxml.etree as ET
data = ET.parse('sample.xml')
regexpNS = "http://exslt.org/regular-expressions"
for entry in data.iter('entry'):
    A = entry.xpath('.//*[re:test(text(), ".*\(.*\).*")]',
                    namespaces={'re':regexpNS})

Unfortunately, this won’t actually work for you because you want text in the tail, which isn’t included in text(). The lxml docs make it look like this should be included in string(), but I tried that and it doesn’t work either. I can’t find any way to do this using xpath and lxml.

So, here’s a way to do it with more Python and less xpath:

 import re
 import lxml.etree as ET
 rx = re.compile('.*\(.*\).*')
 data = ET.parse('sample.xml')
 for entry in data.iter('entry'):
    for child in entry.xpath('.//*'):
        if rx.match(child.text + child.tail):
            # Your manipulations go here
            print child

In either case, a happy side effect is that this regex is having a totally rockin’ good time in the snow: .*\(.*\).*.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have an xml file like this: <data> <entry> <word>ABC</word> (this) </entry> <entry> <word>ABC</word>

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply