
Editorial Team

Asked: May 15, 2026 (2026-05-15T21:27:41+00:00)

I’m working in Python with HTML that looks like this. I’m parsing with lxml, but could equally happily use pyquery:

<p><span class="Title">Name</span>Dave Davies</p>
<p><span class="Title">Address</span>123 Greyfriars Road, London</p>

Pulling out ‘Name’ and ‘Address’ is dead easy, whatever library I use, but how do I get the remainder of the text – i.e. ‘Dave Davies’?


1 Answer

Editorial Team

Added an answer on May 15, 2026 at 9:27 pm

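Each lxml Element can have a text attribute and a tail attribute. text holds the text inside the element before its first child; tail holds the text that follows the element's closing tag, up to the next sibling or the end of the parent. Here, ‘Dave Davies’ is the tail of the span (the lxml documentation covers this; search it for the word “tail”):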

import lxml.etree

content = '''\
<p><span class="Title">Name</span>Dave Davies</p>
<p><span class="Title">Address</span>123 Greyfriars Road, London</p>'''

# HTMLParser wraps the fragment in <html><body>, so search all descendants.
root = lxml.etree.fromstring(content, parser=lxml.etree.HTMLParser())
for elt in root.findall('.//span'):
    # elt.text is the span's own text; elt.tail is the text after </span>.
    print(elt.text, elt.tail)

# Output (Python 3):
# Name Dave Davies
# Address 123 Greyfriars Road, London
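If you want the values keyed by their labels rather than printed, the text/tail pairs can be collected into a dictionary. This is only a minimal sketch: the helper name extract_fields is made up for illustration, and it assumes that every label sits in a span with class "Title" and that the value is the text immediately after that span.

```python
import lxml.etree


def extract_fields(html):
    """Map each Title span's text to the text that follows it (its tail).

    extract_fields is a hypothetical helper, not part of lxml.
    """
    root = lxml.etree.fromstring(html, parser=lxml.etree.HTMLParser())
    fields = {}
    for span in root.iter('span'):
        # Skip spans that are not labels (assumes class="Title" marks a label).
        if span.get('class') != 'Title':
            continue
        # text or tail is None when there is no text, so fall back to ''.
        label = (span.text or '').strip()
        value = (span.tail or '').strip()
        fields[label] = value
    return fields


content = '''\
<p><span class="Title">Name</span>Dave Davies</p>
<p><span class="Title">Address</span>123 Greyfriars Road, London</p>'''

fields = extract_fields(content)
print(fields['Name'])
print(fields['Address'])
```

Note that tail only covers the text up to the next sibling element, so if a value could itself contain markup (for example a link inside the address), you would need to gather the following siblings as well.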
