The Archive Base

Editorial Team
Asked: May 29, 2026

I am fetching some HTML table rows with BeautifulSoup using this piece of code:

from bs4 import BeautifulSoup
import urllib2
import re

page = urllib2.urlopen('http://www.something.bla')  # urlopen needs the URL scheme
soup = BeautifulSoup(page)
rows = soup.findAll('tr', attrs={'class': re.compile('class1.*')})

This is what I get as a result:

<tr class="class1 class2 class3">...</tr>
<tr class="class1 class2 class3">...</tr>
<tr class="class1 class5">...</tr>
<tr class="class1_a class5_a">...</tr>
<tr class="class1 class5">...</tr>
<tr class="class1_a class5_a">...</tr>
<!-- etc. -->

However, I’d like to exclude the rows whose class attribute is class1 class2 class3 (or, better, not select them in the first place).

How can I do that?
Thanks for help!

1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 10:20 pm

    Perhaps it’s easier without regex. This works with BeautifulSoup 3:

    from BeautifulSoup import BeautifulSoup
    
    page = """
    <tr class="class1 class2 class3">1</tr>
    <tr class="class1 class2 class3">2</tr>
    <tr class="class1 class5">3</tr>
    <tr class="class1_a class5_a">4</tr>
    <tr class="class1 class5">5</tr>
    <tr class="class1_a class5_a">6</tr>
    <tr>7</tr>"""
    
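    def cond(x):
        # In BeautifulSoup 3, x is the whole class attribute as one string,
        # or None when the row has no class attribute.
        if x:
            return x.startswith("class1") and "class2 class3" not in x
        else:
            return False
    
    soup = BeautifulSoup(page)
    rows = soup.findAll('tr', {'class': cond})
    
    for row in rows:
        print(row)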
    

    =>

    <tr class="class1 class5">3</tr>
    <tr class="class1_a class5_a">4</tr>
    <tr class="class1 class5">5</tr>
    <tr class="class1_a class5_a">6</tr>
    

    With BeautifulSoup 4, I was able to make it work as follows:

    import re
    from bs4 import BeautifulSoup
    
    page = """
    <tr class="class1 class2 class3">1</tr>
    <tr class="class1 class2 class3">2</tr>
    <tr class="class1 class5">3</tr>
    <tr class="class1_a class5_a">4</tr>
    <tr class="class1 class5">5</tr>
    <tr class="class1_a class5_a">6</tr>
    <tr>7</tr>"""
    
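    # Name the parser explicitly so the result does not depend on what is installed.
    soup = BeautifulSoup(page, "html.parser")
    rows = soup.find_all('tr', {'class': re.compile('class1.*')})
    
    for row in rows:
        # cls is a list of class names here, e.g. ['class1', 'class5']
        cls = row.attrs.get("class")
        if not ("class2" in cls or "class3" in cls):
            print(row)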
    

    =>

    <tr class="class1 class5">3</tr>
    <tr class="class1_a class5_a">4</tr>
    <tr class="class1 class5">5</tr>
    <tr class="class1_a class5_a">6</tr>
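
The same selection can probably also be written as a single CSS selector. This is only a sketch, not part of the original answer: it assumes a BeautifulSoup 4 release whose select() is backed by the soupsieve package (4.7 or later) and the built-in html.parser, and the selector string is my own illustration.

```python
from bs4 import BeautifulSoup

page = """
<tr class="class1 class2 class3">1</tr>
<tr class="class1 class2 class3">2</tr>
<tr class="class1 class5">3</tr>
<tr class="class1_a class5_a">4</tr>
<tr class="class1 class5">5</tr>
<tr class="class1_a class5_a">6</tr>
<tr>7</tr>"""

soup = BeautifulSoup(page, "html.parser")

# [class^="class1"]    : the class attribute text starts with "class1"
# :not(.class2.class3) : skip rows that carry both class2 and class3
rows = soup.select('tr[class^="class1"]:not(.class2.class3)')

# Collect the cell text of each selected row for a quick check.
texts = [row.get_text() for row in rows]
print(texts)
```

Under those assumptions this should pick the rows numbered 3 to 6 and skip the row without a class attribute, with no post-filtering loop.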
    

    In BS4, multi-valued attributes like class have lists of strings as their values, not strings. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#id12.
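
Because class is a list in BS4, the filtering can also be moved into a function that receives the whole tag, so the unwanted rows are never selected. A minimal sketch, assuming the built-in html.parser; the helper name wanted is hypothetical and not from the original answer:

```python
from bs4 import BeautifulSoup

page = """
<tr class="class1 class2 class3">1</tr>
<tr class="class1 class2 class3">2</tr>
<tr class="class1 class5">3</tr>
<tr class="class1_a class5_a">4</tr>
<tr class="class1 class5">5</tr>
<tr class="class1_a class5_a">6</tr>
<tr>7</tr>"""


def wanted(tag):
    """Hypothetical filter: tr rows with a class1* class, minus class2+class3 rows."""
    if tag.name != "tr":
        return False
    classes = tag.get("class", [])  # list of class names; [] when absent
    has_class1 = any(c.startswith("class1") for c in classes)
    excluded = {"class2", "class3"} <= set(classes)
    return has_class1 and not excluded


soup = BeautifulSoup(page, "html.parser")
rows = soup.find_all(wanted)  # find_all accepts a function that takes a tag

# Collect the cell text of each selected row for a quick check.
texts = [row.get_text() for row in rows]
print(texts)
```

Passing a function to find_all keeps the whole condition in one place, and working on the list of class names avoids depending on the order in which the classes appear in the attribute.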

© 2021 The Archive Base. All Rights Reserved