I am trying to write a threaded Python script which will iterate through a list of urls and open each one in a separate thread.
from BeautifulSoup import BeautifulSoup
from threading import Thread
import mechanize
tickers = ["aapl", "siri", "goog", "intc"]
nextTicker = 0
def quotes(i):
br = mechanize.Browser()
br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10')]
r= br.open('http://finance.yahoo.com/q?s=' + tickers[nextTicker])
html = r.read()
soup = BeautifulSoup(html)
price = soup.findAll('span', attrs={"id":"yfs_l10_" + tickers[nextTicker]})
price = price[0].string
print price
for i in range(4):
t = Thread(target=quotes, args=(i,))
t.start()
I know that I need a nextTicker = nextTicker + 1 in there so that each thread will grab a unique ticker symbol from the list named tickers but I am not sure where to put this or how to ensure that each thread is getting a unique url.
Right now when the script runs it just grabs the index 0 item from the list for all four threads. How do I get each thread to grab the next item in the list and append it to my base url?
If you want thread specific data, pass it in the arguments.
So use tickers[i] instead of tickers[nextTicker]
Better yet, use
Possibly better yet, checkout eventlet. It allows writing code like this but avoids some of the problems with threads.