I have written a simple Python program that parses HTML with HTMLParser. Here is my code:
import re
import os.path
import getopt
import getpass
import atom
import getopt
import sys
import string
import cookielib
import ClientCookie
import urllib
import urllib2
from HTMLParser import HTMLParser
from htmlentitydefs import name2codepoint
url = 'http://distribucija.altpro.hr/cjenik_include.php'
all_data = []
def ReParse(pin):
    global values
    values = {'kaj' : 'sifra',
              'rijec' : pin,
              'prikaz' : '20' }
    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)
    response = urllib2.urlopen(req)
    global the_page
    the_page = response.read()

class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        all_data.append(data)
parser = MyHTMLParser()
ReParse('3884429')
parser.feed(the_page)
print all_data[74]
ReParse('1241236')
parser.feed(the_page)
print all_data[74]
Now, the first parser.feed(…) call works and all_data[74] is correct, but the second feed prints exactly the same thing as the first one, which it shouldn't. Can anyone help me?
You're re-assigning values each time. You want to move this:
To the outside of ReParse. And then, inside of that function, you will want to put this:
You also seem to have missed all_data. It does not exist in the MyHTMLParser scope.
I do feel that it might be a good idea to warn you that "global scope" is often not your best option, but that is a separate matter.
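One thing worth checking in the posted code: all_data is a single module-level list that handle_data only ever appends to, so after the second feed the text from the second page lands after the first page's entries and all_data[74] still points at data from the first page. Below is a minimal sketch of one way around that, assuming Python 3 (html.parser in place of the Python 2 HTMLParser module). The names CollectingParser and extract_text are hypothetical, and the two HTML strings are made-up stand-ins for the pages that ReParse would download.

```python
from html.parser import HTMLParser  # Python 3 home of the Python 2 HTMLParser class


class CollectingParser(HTMLParser):
    """Hypothetical replacement for MyHTMLParser that stores text per instance."""

    def __init__(self):
        super().__init__()
        self.chunks = []  # fresh list for every parser, not a shared global

    def handle_data(self, data):
        # Called by HTMLParser for each run of text between tags.
        self.chunks.append(data)


def extract_text(page):
    """Parse one page with a brand-new parser and return its text chunks."""
    parser = CollectingParser()
    parser.feed(page)
    parser.close()  # flush any text still buffered at the end of input
    return parser.chunks


# Made-up stand-in pages; in the question these would be the downloaded HTML.
first = extract_text("<td>3884429</td><td>10.00</td>")
second = extract_text("<td>1241236</td><td>25.50</td>")

# The same index now refers to the matching position in each page.
print(first[1])
print(second[1])
```

Because each call builds its own parser and its own list, nothing carries over between pages, and the function returns its result instead of relying on globals. If you would rather keep the global list, clearing it before each feed should have a similar effect.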