The Archive Base

Editorial Team
Asked: June 1, 2026

Question

How do I remove class attributes from HTML using Python and lxml?

Example

I have:

<p class="DumbClass">Lorem ipsum dolor sit amet, consectetur adipisicing elit</p>

I want:

<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit</p>

What I’ve tried so far

I’ve checked out lxml.html.clean.Cleaner; however, it does not have an option to strip out class attributes. You can set safe_attrs_only=True, but this does not remove the class attribute.

Significant searching has turned up nothing workable. I think the fact that class is a keyword in both HTML and Python further muddies the search results. Many of the results also seem to deal strictly with XML.

I’m open to other Python modules that offer humane interfaces as well.

Thanks much.


Solution

Thanks to @Dan Roberts' answer below, I came up with the following solution. It is presented here for anyone arriving in the future trying to solve the same problem.

import lxml.html

# The HTML string we want to remove the class attribute from
html_string = '<p class="DumbClass">Lorem ipsum dolor sit amet, consectetur adipisicing elit</p>'

# Parse the HTML
html = lxml.html.fromstring(html_string)

# Print out our "Before" (encoding='unicode' returns a str instead of bytes)
print(lxml.html.tostring(html, encoding='unicode'))

# .xpath below gives us a list of all elements that have a class attribute
# xpath syntax explained:
# // = select all tags that match our expression regardless of location in doc
# * = match any tag
# [@class] = keep only tags that have a class attribute
for tag in html.xpath('//*[@class]'):
    # For each element with a class attribute, remove that class attribute
    tag.attrib.pop('class')

# Print out our "After"
print(lxml.html.tostring(html, encoding='unicode'))
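
As a possible alternative to the XPath loop above, lxml also ships a helper, lxml.etree.strip_attributes, that deletes the named attributes from an element and all of its descendants. The following is a minimal sketch rather than a definitive implementation; the sample markup and the names root and result are made up for illustration.

```python
import lxml.html
from lxml import etree

# Illustrative input (not from the question): nested elements, one with an id
html_string = '<div class="outer"><p class="DumbClass" id="intro">Lorem ipsum</p></div>'

# Parse the HTML fragment
root = lxml.html.fromstring(html_string)

# Remove every "class" attribute from root and its descendants;
# other attributes (such as id) are left untouched
etree.strip_attributes(root, 'class')

# Serialize back to a str
result = lxml.html.tostring(root, encoding='unicode')
print(result)
```

Several attribute names can be passed in one call, for example etree.strip_attributes(root, 'class', 'style').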
1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 3:56 pm

    I can’t test this at the moment, but this appears to be the general idea:

    for tag in node.xpath('//*[@class]'):
        tag.attrib.pop('class')
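
One detail worth knowing about the loop above: an XPath expression that starts with // is evaluated from the document root, even when .xpath() is called on a sub-element. If the intent is to strip classes only inside one node, a relative axis such as descendant-or-self:: may be closer to what is wanted. Here is a small sketch of the difference; the markup and the variable names (doc, node, absolute, scoped, result) are hypothetical.

```python
import lxml.html

# Illustrative document with three elements that carry a class attribute
doc = lxml.html.fromstring(
    '<div class="a"><p class="b">scoped</p><span class="c">other</span></div>'
)

# Hypothetical sub-node we want to restrict the cleanup to
node = doc.find('p')

# '//' ignores the context node and searches the whole document
absolute = node.xpath('//*[@class]')

# 'descendant-or-self::' stays within node (node itself plus its descendants)
scoped = node.xpath('descendant-or-self::*[@class]')

# Strip the class attribute only from the scoped matches
for tag in scoped:
    tag.attrib.pop('class')

result = lxml.html.tostring(doc, encoding='unicode')
print(len(absolute), len(scoped))
print(result)
```

In this sketch the absolute query matches all three elements, while the scoped query matches only the p element, so the div and span keep their classes.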
    

© 2021 The Archive Base. All Rights Reserved