The Archive Base

Editorial Team
Asked: June 18, 2026 (2026-06-18T15:48:49+00:00)

I am parsing a website using BeautifulSoup. I know that the content I want is in a div with the class content, and that all of it is in p tags. So I ran:

paragraphs= content.findAll('p')

This works fine. I then iterate over the list, with an if condition that breaks out of the loop when a particular class is encountered:

for para in paragraphs:
    if 'class' in para:
        if para['class']=='end':
            break

But this isn’t working. When I run the loop it doesn’t break when the end class is encountered. In fact, while iterating over the loop, the classes of all the elements seem to get lost.

for para in paragraphs:
    if 'class' in para:
        print para['class']

This prints nothing, even though there are elements with classes. Yet accessing the attribute directly does print the class:

>>> paragraphs[0]['class']
u'dateline'

But,

>>> print 'class' in paragraphs[0]
False

I don’t quite understand what is going on here. I eventually solved my problem by using exceptions, but this is still bugging me. Can anybody explain what is happening?


1 Answer

  1. Editorial Team
    Added an answer on June 18, 2026 at 3:48 pm

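    When you write if 'class' in para, you are asking whether the string 'class' is one of the tag's children (its contents), not whether the tag has a class attribute. That is why the test is False even though paragraphs[0]['class'] works: square brackets look up attributes, while in searches the child nodes. I believe your intention was to check whether the tag has a class, so what you want is: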

    for para in paragraphs:
        if para.has_attr('class'):
            # BeautifulSoup 4 returns para['class'] as a list of class names,
            # hence the [0] to compare against the first one.
            if para['class'][0] == 'end':
                break
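
    To see the difference concretely, here is a small self-contained sketch. It assumes BeautifulSoup 4 (the `bs4` package) with the built-in `html.parser`; the HTML snippet and the helper name `paragraphs_before_end` are made up for illustration. It uses `para.get('class', [])`, which also covers tags carrying several classes, where comparing only `para['class'][0]` could miss `end`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
HTML = """
<div class="content">
  <p class="dateline">June 18</p>
  <p>First paragraph.</p>
  <p>class</p>
  <p class="note end">Footer starts here.</p>
  <p>Never reached.</p>
</div>
"""


def paragraphs_before_end(html):
    """Return the text of each <p> up to, but not including, the first one with class 'end'."""
    soup = BeautifulSoup(html, "html.parser")
    content = soup.find("div", class_="content")
    collected = []
    for para in content.find_all("p"):
        # get() returns the default when the attribute is missing, so no
        # exception handling is needed; in bs4 the class value is a list.
        if "end" in para.get("class", []):
            break
        collected.append(para.get_text())
    return collected


soup = BeautifulSoup(HTML, "html.parser")
first, _, literal = soup.find_all("p")[:3]

print("class" in first)         # membership looks at child nodes, not attributes
print(first.has_attr("class"))  # explicit attribute check
print("class" in literal)       # this <p> has the string "class" as its child
print(paragraphs_before_end(HTML))
```

    The third paragraph is there only to show the one case where the membership test succeeds: the tag's child is literally the string "class".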
    
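
    As for why `in` and square brackets disagree: a tag behaves like a container of its child nodes, while indexing reads its attributes. The sketch below is a simplified stand-in written purely for illustration (`ToyTag` is hypothetical and is not BeautifulSoup's actual source); it only mirrors the split between membership and item access.

```python
class ToyTag:
    """Toy model of a parsed tag: attributes in a dict, children in a list."""

    def __init__(self, name, attrs=None, contents=None):
        self.name = name
        self.attrs = dict(attrs or {})
        self.contents = list(contents or [])

    def __getitem__(self, key):
        # tag['class'] reads from the attribute dict.
        return self.attrs[key]

    def __contains__(self, item):
        # 'x in tag' searches the children, so attribute names never match.
        return item in self.contents

    def has_attr(self, key):
        # Explicit attribute test, the check the question actually needed.
        return key in self.attrs


para = ToyTag("p", attrs={"class": "dateline"}, contents=["June 18"])
print(para["class"])
print("class" in para)
print(para.has_attr("class"))
```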

© 2021 The Archive Base. All Rights Reserved