I’m using the following code from the book “Hello! Python”:

    import urllib2
    from bs4 import BeautifulSoup
    import os

    def get_stock_html(ticker_name):
        opener = urllib2.build_opener(
            urllib2.HTTPRedirectHandler(),
            urllib2.HTTPHandler(debuglevel=0),
        )
        opener.addhaders = [
            ('User-agent',
             "Mozilla/4.0 (compatible; MSIE 7.0; "
             "Windows NT 5.1; .NET CLR 2.0.50727; "
             ".NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)")
        ]
        url = "http://finance.yahoo.com/q?s=" + ticker_name
        response = opener.open(url)
        return ''.join(response.readlines())

    def find_quote_section(html):
        soup = BeautifulSoup(html)
        # quote = soup.find('div', attrs={'class': 'yfi_rt_quote_summary_rt_top'})
        quote = soup.find('div', attrs={'class': 'yfi_quote_summary'})
        return quote

    def parse_stock_html(html, ticker_name):
        quote = find_quote_section(html)
        result = {}
        tick = ticker_name.lower()
        result['stock_name'] = quote.find('h2').contents[0]

    if __name__ == '__main__':
        os.system("clear")
        html = get_stock_html('GOOG')
        # print find_quote_section(html)
        print parse_stock_html(html, 'GOOG')

I’m getting the following error:

    Traceback (most recent call last):
      File "dwlod.py", line 33, in <module>
        print parse_stock_html(html, 'GOOG')
      File "dwlod.py", line 25, in parse_stock_html
        result['stock_name'] = quote.find('h2').contents[0]
    AttributeError: 'NoneType' object has no attribute 'contents'

I’m a newbie and don’t really know what to make of it. Is the book just wrong?
ADDED
I just replaced `result['stock_name'] = quote.find('h2').contents[0]` with:

    x = BeautifulSoup(html).find('h2').contents[0]
    return x

Now, nothing gets returned, but the error no longer crops up. So, is there something wrong with the original Python syntax?

While Yahoo Finance hasn’t really changed its layout in a while, it seems they may have tweaked it slightly since the book was released. The info you need, such as the `h2` containing the stock symbol, can be found within `yfi_rt_quote_summary`, which is the container located on top of `yfi_quote_summary`. Also note that we need to return `result` if we want to print something; otherwise `None` is returned.

BTW, note that `find` simply finds the first match, which seems to be empty. BeautifulSoup also has `find_all` (spelled `findAll` in BeautifulSoup 3), which returns all matches; it seems the fourth value is the one we are looking for …

Still, I’m sure you are not doing this, but please don’t parse the entire document every time; this can be quite expensive.
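
To make these points concrete, here is a minimal sketch rather than the book's code. `SAMPLE_HTML` is a made-up, heavily simplified stand-in for the Yahoo Finance page (the real markup is more complex and may have changed again), so the class names and heading text in it are assumptions for illustration only. The sketch parses the document once, checks for `None` before chaining calls, and returns `result`.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the real page; not actual Yahoo markup.
SAMPLE_HTML = """
<html><body>
  <h2></h2>
  <div class="yfi_rt_quote_summary">
    <div class="hd"><h2>Google Inc. (GOOG)</h2></div>
  </div>
  <div class="yfi_quote_summary"><table></table></div>
</body></html>
"""


def find_quote_section(soup):
    # find() returns the first match, or None if nothing matches,
    # so the caller has to check before calling methods on the result.
    return soup.find('div', attrs={'class': 'yfi_rt_quote_summary'})


def parse_stock_html(html, ticker_name):
    # ticker_name is kept only to mirror the original signature;
    # this sketch does not use it.
    soup = BeautifulSoup(html, 'html.parser')  # parse once, then reuse soup
    result = {}
    quote = find_quote_section(soup)
    if quote is None:
        return result  # section missing: return an empty dict, do not crash
    heading = quote.find('h2')
    if heading is not None:
        result['stock_name'] = heading.get_text().strip()
    return result  # without this line the function would return None


def all_h2_texts(html):
    # find_all() returns every match, which helps when the first one is empty.
    soup = BeautifulSoup(html, 'html.parser')
    return [h2.get_text().strip() for h2 in soup.find_all('h2')]
```

Calling `parse_stock_html(SAMPLE_HTML, 'GOOG')` on this sample yields a dict with a `stock_name` entry, while `all_h2_texts(SAMPLE_HTML)` shows that the first `h2` in the sample is empty, which is why a bare `find('h2')` on the whole document is not enough. On the live page the position of the useful heading may differ, so inspect the `find_all` output before relying on an index.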