When I run a loop over a bunch of URLs to find all links (in certain divs) on those pages, I get back this error:
Traceback (most recent call last):
  File "file_location", line 38, in <module>
    out.writerow(tag['href'])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 0: ordinal not in range(128)
The code I have written related to this error is:
out = csv.writer(open("file_location", "ab"), delimiter=";")
for tag in soup_3.findAll('a', href=True):
    out.writerow(tag['href'])
Is there a way to solve this, possibly using an if statement to ignore any URLs that have Unicode errors?
Thanks in advance for your help.
You can wrap the writerow method call in a try and catch the exception to ignore it, but you almost certainly want to pick an encoding other than ASCII for your CSV file (utf-8 unless you have a very good reason to use something else), and open it with codecs.open() instead of the built-in open.
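Both suggestions can be sketched together as follows. This is a minimal sketch written for Python 3, where the built-in open (or io.open) accepts an encoding argument directly; the helper names write_links and open_csv_for_append and the file name links.csv are hypothetical and not from the question. The traceback above looks like Python 2, where the csv module does not handle unicode itself, so there the usual workaround is to encode each value first, e.g. out.writerow([tag['href'].encode('utf-8')]).

```python
import csv
import io


def write_links(hrefs, fileobj, delimiter=";"):
    """Write one href per row; return the hrefs that could not be encoded.

    Hypothetical helper: `fileobj` is any text-mode file object.
    """
    out = csv.writer(fileobj, delimiter=delimiter)
    skipped = []
    for href in hrefs:
        try:
            # writerow() expects a sequence of fields. Passing a bare string
            # would write one field per character, so wrap it in a list.
            out.writerow([href])
        except UnicodeEncodeError:
            # Only reached when the file's encoding cannot represent the
            # text (for example ASCII); UTF-8 can encode any ordinary URL.
            skipped.append(href)
    return skipped


def open_csv_for_append(path, encoding="utf-8"):
    """Open `path` for appending text (hypothetical helper)."""
    # newline="" lets the csv module control line endings itself.
    return io.open(path, "a", encoding=encoding, newline="")
```

Assuming soup_3 is the BeautifulSoup object from the question, usage might look like this:

```python
with open_csv_for_append("links.csv") as f:
    write_links((tag["href"] for tag in soup_3.findAll("a", href=True)), f)
```

With a UTF-8 file the except branch should rarely, if ever, fire, which is why choosing the encoding is usually preferable to silently dropping rows.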