I'm writing a web crawler and using wxPython to display the results in real time. Assume the main window has a single button named Crawl. When I click the button, a new window should open, and the TextCtrl in that window should display the URL currently being crawled.
The code can be simplified as follows (just the UI, with the WebCrawler started in the OnDisplayClick function):
# -*- coding: utf-8 -*-
import wx


class Main ( wx.Frame ):

    def __init__( self, parent ):
        wx.Frame.__init__ ( self, parent, id = wx.ID_ANY, title = wx.EmptyString, pos = wx.DefaultPosition, size = wx.Size( 500,300 ), style = wx.DEFAULT_FRAME_STYLE|wx.TAB_TRAVERSAL )
        self.SetSizeHintsSz( wx.DefaultSize, wx.DefaultSize )
        bSizer3 = wx.BoxSizer( wx.VERTICAL )
        self.Crawl = wx.Button( self, wx.ID_ANY, u"Crawl", wx.DefaultPosition, wx.DefaultSize, 0 )
        self.Crawl.SetDefault()
        bSizer3.Add( self.Crawl, 0, wx.ALL, 5 )
        self.SetSizer( bSizer3 )
        self.Layout()
        self.Centre( wx.BOTH )
        # Connect Events
        self.Crawl.Bind( wx.EVT_BUTTON, self.OnDisplayClick )

    def __del__( self ):
        pass

    # Virtual event handlers, override them in your derived class
    def OnDisplayClick( self, event ):
        # Show the display window
        newDisplay = Display(self)
        newDisplay.Show()
        ############################################################
        ## start a multi-threading webcrawler                     ##
        ############################################################
        web_crawler = WebCrawler(newDisplay.current_url)
        web_crawler.startCrawl()


class Display ( wx.Frame ):

    def __init__( self, parent ):
        wx.Frame.__init__ ( self, parent, id = wx.ID_ANY, title = wx.EmptyString, pos = wx.DefaultPosition, size = wx.Size( 500,300 ), style = wx.DEFAULT_FRAME_STYLE|wx.TAB_TRAVERSAL )
        self.SetSizeHintsSz( wx.DefaultSize, wx.DefaultSize )
        bSizer4 = wx.BoxSizer( wx.VERTICAL )
        self.cur_url = wx.StaticText( self, wx.ID_ANY, u"Current_URL: ", wx.DefaultPosition, wx.DefaultSize, 0 )
        self.cur_url.Wrap( -1 )
        bSizer4.Add( self.cur_url, 0, wx.ALL, 5 )
        self.current_url = wx.TextCtrl( self, wx.ID_ANY, wx.EmptyString, wx.DefaultPosition, wx.DefaultSize, 0 )
        bSizer4.Add( self.current_url, 0, wx.ALL, 5 )
        self.SetSizer( bSizer4 )
        self.Layout()
        self.Centre( wx.BOTH )

    def __del__( self ):
        pass

WebCrawler is a multi-threaded crawler. I pass the TextCtrl (current_url) to the WebCrawler so that it can show the URL currently being crawled in the display window. But when I click the Crawl button, the interface appears to freeze. I guess this is because the multi-threaded WebCrawler is running and the UI thread never gets a chance to display the new window. I tried writing two more threads with threading.Thread, one to show the new window and one to run the WebCrawler, but that failed: the app often exited almost immediately, although it could display the window and run the crawl threads for a few seconds. Sometimes it printed messages like:
(python2.7:5231): Pango-CRITICAL **: pango_layout_get_iter: assertion `PANGO_IS_LAYOUT (layout)' failed
(python2.7:5404): GLib-GObject-CRITICAL **: g_object_ref: assertion `object->ref_count > 0' failed
The two threads are as follows:
import threading


class UpdateThread(threading.Thread):
    """ WebCrawler thread """

    def __init__(self, webCrawl):
        threading.Thread.__init__(self)
        self.webCrawl = webCrawl

    def run(self):
        self.webCrawl.start()


class CrawlShowThread(threading.Thread):
    """ Display thread """

    def __init__(self, crawl_display):
        threading.Thread.__init__(self)
        self.crawl_display = crawl_display

    def run(self):
        self.crawl_display.Show()
I then call start() on both in the OnCrawlClick() function, but as I said above, this approach doesn't work.
Can anyone tell me the right way to handle this? Any help is appreciated!
You aren’t allowed to access the GUI from a non-main thread. See the documentation wiki on this.
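As a rough illustration of that rule, here is a minimal sketch, not the asker's actual crawler. The window is created and shown on the main thread, and the worker thread never touches a widget directly. Instead it hands each update to wx.CallAfter, which queues the call so that it runs on the GUI thread. The names CrawlWorker, on_url, post and main, and the example URLs, are made up for this example. The poster is passed in as a parameter only so the worker can be tried without a GUI.

```python
import threading


class CrawlWorker(threading.Thread):
    """Background thread that reports each URL through a thread-safe poster."""

    def __init__(self, urls, on_url, post):
        threading.Thread.__init__(self)
        self.daemon = True    # do not keep the process alive after the GUI exits
        self.urls = list(urls)
        self.on_url = on_url  # GUI callback, e.g. text_ctrl.SetValue
        self.post = post      # e.g. wx.CallAfter, which runs on_url on the GUI thread

    def run(self):
        for url in self.urls:
            # A real crawler would fetch and parse the page here (omitted).
            # Never call self.on_url(url) directly from this thread.
            self.post(self.on_url, url)


def main():
    """Hypothetical demo; needs wxPython, so wx is imported only here."""
    import wx

    app = wx.App(False)
    frame = wx.Frame(None, title="Crawl")
    text = wx.TextCtrl(frame)
    frame.Show()  # windows are created and shown on the main thread only
    CrawlWorker(["http://example.com/a", "http://example.com/b"],
                text.SetValue, wx.CallAfter).start()
    app.MainLoop()
```

Calling main() opens the window and starts the worker. In the code from the question, the same idea would mean calling Show() directly inside the button handler and having the crawler threads report URLs with wx.CallAfter(display.current_url.SetValue, url) rather than calling SetValue themselves.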