The Archive Base

Editorial Team
Asked: June 13, 2026 (2026-06-13T13:31:46+00:00)

I am trying to parse an HTML page in Python. When I reach a certain tag, I would like to start printing all the data. So far I have come up with this:

class MyHTMLParser(HTMLParser):
    start = False;
    counter = 0;
    def handle_starttag(self,tag,attrs):
        if(tag == 'TBODY'):
            start = True;
            counter +=1
            #if counter == 1
    def handle_data(self,data):
        if (start == True): # this is the error line
            print data

The problem is that I get an error saying that start is not defined. I know I could use global, but wouldn’t that force me to define the variable outside the whole class?

EDIT:
Changing start to self.start solves the problem, but is there a way to define it inside __init__ without messing up the HTMLParser __init__?

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 1:31 pm
    class MyHTMLParser(HTMLParser):
        start = False;
        counter = 0;
        ...
    

    This does not do what you think it does!

    In Java, C#, or similar languages, what the analogous code does is declare that the class of objects known as MyHTMLParser all have an attribute start with the initial value of False, and counter with the initial value of 0.

    In Python, classes are objects too. They have their own attributes, just like every other object. So what the above does in Python is create a class object named MyHTMLParser, with an attribute start set to False and an attribute counter set to 0. [1]
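
    A minimal sketch of this point, using a hypothetical class named Demo rather than the parser itself. Names assigned in the class body are stored on the class object; an instance only appears to have them because attribute lookup falls back to the class.

```python
class Demo(object):
    start = False    # stored on the class object, not on any instance
    counter = 0

first = Demo()
second = Demo()

# Neither instance has its own attributes yet; lookups fall back to the class.
class_has_start = 'start' in vars(Demo)
instance_has_start = 'start' in vars(first)

# Assigning through an instance creates an attribute on that instance only.
first.start = True
shared_value = Demo.start
second_value = second.start
```

    After the assignment, first.start is True while Demo.start and second.start are still False, which is why class-body assignments are a poor fit for per-instance state.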

    Another thing to keep in mind is that there is no way whatsoever to make an assignment to a bare name like start = True set an attribute on an object. It always sets a variable named start. [2]

    So your class contains no code that ever sets any attributes on any of your MyHTMLParser instances; the code in the class body is setting attributes on the class object itself, and the code in handle_starttag is setting local variables which are then discarded when they fall out of scope.

    Your code in handle_data is reading from a variable named start (which is never set in any scope that function can see), for similar reasons. In Python there is no way to read an attribute without specifying which object to look for it in. A bare start always refers to a variable, either in the local function scope or in some outer scope. You need self.start to read the start attribute of the self object.
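
    The behaviour described above can be reproduced in a small sketch. The class Broken and its method names are hypothetical stand-ins for the parser in the question; each method uses a bare name where an attribute was intended.

```python
class Broken(object):
    start = False
    counter = 0

    def set_start(self):
        start = True      # creates a local variable, discarded on return

    def bump(self):
        counter += 1      # reads a local that was never assigned

    def read_start(self):
        return start      # looked up as a variable, never on self


obj = Broken()
obj.set_start()
start_after_call = obj.start    # the attribute was never touched

errors = []
for method in (obj.bump, obj.read_start):
    try:
        method()
    except NameError as exc:    # UnboundLocalError is a subclass of NameError
        errors.append(type(exc).__name__)
```

    Here set_start leaves obj.start unchanged, bump fails with UnboundLocalError, and read_start fails with NameError because no variable named start exists in any enclosing scope.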

    Remember, the def block defining a method is nothing special, it’s a function like any other. It’s only later, when that function happens to be stored in an attribute of a class object, that the function can be classified as a method. So the self parameter behaves the same as any other parameter, and indeed any other name. It doesn’t have to be named self (though that’s a wise convention to follow), and it has no special privileges making reads and writes of bare names look for attributes of self.
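
    As an illustration (the names describe, this and Widget are made up for this sketch), a plain function written outside any class can be attached to a class afterwards, and its first parameter does not need to be called self:

```python
def describe(this):
    # 'this' plays the role usually given to 'self'; the name is not special.
    return 'start is %s' % this.start


class Widget(object):
    def __init__(self):
        self.start = True


# Storing the plain function on the class is what makes it usable as a method.
Widget.describe = describe

w = Widget()
as_method = w.describe()     # the instance is passed as 'this' automatically
as_function = describe(w)    # the same function, called directly
```

    Both calls return the same string, and inside describe the attribute still has to be reached through the parameter, as this.start.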

    So:

    1. Don’t define your attributes with their initial values in the class block; that’s for values which are shared by all instances of the class, not attributes of each instance. Instance attributes can only be initialised once you have a reference to the particular instance; most commonly this is done in the __init__ method, which is called as soon as the object exists.

    2. You must specify in which object you want to read or write attributes. This applies always, in every context. In particular, you will usually refer to attributes inside methods as self.attribute.
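
    The difference between the two places to put an initial value shows up most clearly with a mutable value. A small sketch, with hypothetical classes SharedLog and OwnLog:

```python
class SharedLog(object):
    entries = []                 # one list, shared by every instance

    def add(self, item):
        self.entries.append(item)


class OwnLog(object):
    def __init__(self):
        self.entries = []        # a fresh list for each instance

    def add(self, item):
        self.entries.append(item)


a, b = SharedLog(), SharedLog()
a.add('x')
shared_seen_by_b = list(b.entries)

c, d = OwnLog(), OwnLog()
c.add('x')
own_seen_by_d = list(d.entries)
```

    With the class-level list, the item added through a is also visible through b; with the list created in __init__, d keeps its own empty list.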

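    Applying that (eliminating the semicolons, which you don’t need in Python, calling the base-class __init__ so that HTMLParser still sets up its own state, and matching the lower-case tag names that HTMLParser passes to handle_starttag):

    from html.parser import HTMLParser  # Python 3; the module is named HTMLParser in Python 2

    class MyHTMLParser(HTMLParser):
        def __init__(self):
            # Initialise the base class first so the parser's internal state exists.
            HTMLParser.__init__(self)
            # Per-instance attributes must be set on self, not on bare names.
            self.start = False
            self.counter = 0
    
        def handle_starttag(self, tag, attrs):
            # HTMLParser converts tag names to lower case before calling this.
            if tag == 'tbody':
                self.start = True
                self.counter += 1
    
        def handle_data(self, data):
            if self.start:
                print(data)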
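
    A rough usage sketch, assuming Python 3. CollectingParser is a hypothetical variant that gathers the text into a list instead of printing it, so the result can be inspected; the HTML string is invented for the example.

```python
from html.parser import HTMLParser


class CollectingParser(HTMLParser):
    def __init__(self):
        super().__init__()       # let HTMLParser set up its own state first
        self.start = False
        self.chunks = []         # hypothetical: gather text instead of printing

    def handle_starttag(self, tag, attrs):
        if tag == 'tbody':       # tag names arrive in lower case
            self.start = True

    def handle_data(self, data):
        if self.start:
            self.chunks.append(data)


parser = CollectingParser()
parser.feed('<p>skipped</p><table><TBODY><tr><td>kept</td></tr></TBODY></table>')
parser.close()
```

    Only the text that appears after the tbody start tag ends up in parser.chunks, even though the tag is written in upper case in the input.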
    

    [1] The methods handle_starttag and handle_data are also nothing more than functions which happen to be attributes of an object that is used as a class.

    [2] Usually a local variable; if you’ve declared start to be global or nonlocal then it might be an outer variable. But it’s definitely not an attribute on some object you happen to have nearby, even if that other object is bound to the name self.
