I have an xml file in which it is possible that the following occurs:

Asked: May 22, 2026 (2026-05-22T21:39:10+00:00)

I have an xml file in which it is possible that the following occurs:

...
<a><b>This is</b> some text about <c>some</c> issue I have, parsing xml</a>
...

Edit: Let's assume the tags can be nested more than one level deep, meaning

<a><b><c>...</c>...</b>...</a>

I came up with this using the Python lxml.etree library.

from lxml import etree

context = etree.iterparse(PATH_TO_XML, dtd_validation=True, events=("end",))
for event, element in context:
    tag = element.tag
    if tag == "a":
        print(element.text)  # is empty :/
        mystring = element.xpath("string()")
        ...

But somehow it goes wrong.

What I want is the whole string

"This is some text about some issue I have, parsing xml"

But I only get an empty string. Any suggestions? Thanks!

1 Answer

Editorial Team · Answer 1 · 2026-05-22T21:39:11+00:00

This question has been asked many times.

You can use the text_content() method of elements parsed with lxml.html.

import lxml.html
t = lxml.html.fromstring("...")
t.text_content()
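
As a minimal sketch of that approach, here is text_content() applied to the sample element from the question. The variable names are only illustrative, and the sketch assumes the fragment is acceptable to lxml's lenient HTML parser, which is not a strict XML parser and may treat some XML input differently.

```python
import lxml.html

# Sample fragment taken from the question; the variable names are illustrative.
snippet = "<a><b>This is</b> some text about <c>some</c> issue I have, parsing xml</a>"

# lxml.html.fromstring uses the HTML parser, so this treats the input as HTML.
element = lxml.html.fromstring(snippet)

# text_content() concatenates the text of the element and all of its descendants.
text = element.text_content()
print(text)
```

Because text_content() walks all descendants, it should also cover the deeper nesting mentioned in the question's edit.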

REF: Filter out HTML tags and resolve entities in python

Or use the lxml.etree.strip_tags() function.

REF: In lxml, how do I remove a tag but retain all contents?
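
A rough sketch of the strip_tags() route, again using the sample element from the question: the tag names "b" and "c" come from that sample, and the variable names are assumptions made for illustration. Note that strip_tags() modifies the tree in place, so it is only suitable if the original markup is no longer needed.

```python
from lxml import etree

# Sample fragment taken from the question; the variable names are illustrative.
snippet = "<a><b>This is</b> some text about <c>some</c> issue I have, parsing xml</a>"
root = etree.fromstring(snippet)

# Remove the <b> and <c> tags but keep their text, merging it into the parent.
etree.strip_tags(root, "b", "c")

# With the child tags gone, the whole string is now the text of <a>.
print(root.text)
```

With an iterparse loop like the one in the question, the same call could be made on each "a" element once its "end" event fires.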
