In Python, I have downloaded a webpage and want to find all occurrences of <a href= in it.
I am using urllib2, and my setup is as follows:
import urllib2
response = urllib2.urlopen("http://python.org")
html = response.read()
What would be the best way to approach this task? How would I select a range of text from the string variable that holds the entire webpage?
For parsing HTML in Python, I prefer BeautifulSoup. This assumes you want to find the links themselves, and not just the literal
<a href=, which you can easily find by searching through the string.
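To make the difference between the two approaches concrete, here is a minimal sketch. It is not BeautifulSoup: it uses the standard library's html.parser as a dependency-free stand-in, and it is written for Python 3 rather than the Python 2 urllib2 setup in the question. The names LinkCollector, find_literal, and sample_html are made up for illustration, and sample_html stands in for the downloaded page.

```python
from html.parser import HTMLParser  # Python 3 module name


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def find_literal(text, needle="<a href="):
    """Return the start index of every literal occurrence of needle in text."""
    positions = []
    start = text.find(needle)
    while start != -1:
        positions.append(start)
        start = text.find(needle, start + len(needle))
    return positions


# Hypothetical sample standing in for the downloaded page
sample_html = (
    '<p><a href="/about">About</a> '
    '<a class="x" href="https://python.org">Python</a></p>'
)

# Approach 1: parse the HTML and collect the link targets
collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)

# Approach 2: search for the literal text, then slice a range out of the string
positions = find_literal(sample_html)
print(positions)
print(sample_html[positions[0]:positions[0] + 20])
```

The sample is chosen so that the two approaches disagree: the second link has a class attribute before href, so the literal search misses it while the parser still finds it. With BeautifulSoup installed, the parsing step would look roughly like soup.find_all("a", href=True) on a BeautifulSoup(html, "html.parser") object. Note that in Python 3, response.read() returns bytes, so the page would need to be decoded to a string before being passed to either approach.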