I'm trying to save a website's source to my MySQL database. The source is successfully retrieved using urllib. Next, the data is saved. The connection to the database is fine; the problem lies with saving the source, because when I remove source from the INSERT statement everything works.
# get the webpage source
f = urllib.urlopen(row_urls['url'])
source_fetched = f.read()
f.close()
# Save the webpage source
scrapy_url_id = row_urls['id']
url = row_urls['url']
created = datetime.datetime.now()
source = unicode(source_fetched,'utf-8')
cur_webpage_save = con.cursor(mdb.cursors.DictCursor)
cur_webpage_save.execute("""INSERT INTO webpage(scrapy_url_id,url,created,source) VALUES('%s', '%s', '%s', '%s');""" %(scrapy_url_id, url, created, source))
I guess it has something to do with characters that need to be escaped. I tried this, but it generates the same error:
cur_webpage_save.execute(mdb.escape_string("""INSERT INTO webpage(scrapy_url_id,url,created,source) VALUES('%s', '%s', '%s', '%s');""" %(scrapy_url_id, url, created, source)))
Below you see the error. What am I doing wrong?
Traceback (most recent call last):
File "clean.py", line 55, in <module>
cur_webpage_save.execute(mdb.escape_string("""INSERT INTO webpage(scrapy_url_id,url,created,source) VALUES('%s', '%s', '%s', '%s');""" %(scrapy_url_id, url, created, source)))
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '\\'4\\', \\'http://example.com/test.html?id=108185\\', \\'2012-10-28' at line 1")
Do not use string formatting to put data into the database; use SQL parameters instead, passing the values as a second argument to execute().
For MySQLdb the syntax is almost the same: just remove the single quotes around each %s placeholder.
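Applied to the insert from the question, this might look like the sketch below. The helper name `save_webpage` is hypothetical, and the sketch assumes `cursor` is a DB-API cursor that uses the `%s` parameter style, as MySQLdb does.

```python
import datetime

# No quotes around the placeholders: the adapter adds them where needed.
INSERT_WEBPAGE = (
    "INSERT INTO webpage (scrapy_url_id, url, created, source) "
    "VALUES (%s, %s, %s, %s)"
)


def save_webpage(cursor, scrapy_url_id, url, source, created=None):
    """Insert one webpage row, passing the values as SQL parameters.

    Hypothetical helper; `cursor` is assumed to accept %s placeholders.
    """
    if created is None:
        created = datetime.datetime.now()
    params = (scrapy_url_id, url, created, source)
    # The SQL text and the values are handed over separately, so quotes
    # inside `source` never become part of the statement itself.
    cursor.execute(INSERT_WEBPAGE, params)
    return params
```

With the names from the question, the call would be `save_webpage(cur_webpage_save, row_urls['id'], row_urls['url'], source)`, followed by `con.commit()` if the connection is not in autocommit mode.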
Each %s is replaced by a properly quoted string value by the database adapter. This also prevents SQL injection attacks.
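To see the quoting at work without a MySQL server, here is a small illustrative sketch that uses Python's built-in sqlite3 module as a stand-in. That substitution is an assumption made purely for demonstration: sqlite3 uses `?` as its placeholder where MySQLdb uses `%s`, and the two-column table and the function name `demo_quoting` are made up for the example.

```python
import sqlite3


def demo_quoting(source):
    """Try the same insert with string formatting and with parameters.

    Returns (formatted_ok, stored): whether the string-formatted insert
    succeeded, and the value read back after the parameterized insert.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE webpage (url TEXT, source TEXT)")
    url = "http://example.com/test.html"

    # String formatting: a quote inside the value ends the SQL literal early.
    try:
        con.execute(
            "INSERT INTO webpage (url, source) VALUES ('%s', '%s')" % (url, source)
        )
        formatted_ok = True
    except sqlite3.Error:
        formatted_ok = False

    # Parameters: the adapter handles the quoting of each value.
    con.execute("DELETE FROM webpage")
    con.execute("INSERT INTO webpage (url, source) VALUES (?, ?)", (url, source))
    stored = con.execute("SELECT source FROM webpage").fetchone()[0]
    con.close()
    return formatted_ok, stored
```

Calling `demo_quoting("<a href='x'>it's</a>")` returns `(False, "<a href='x'>it's</a>")`: the formatted statement fails on the embedded quotes, while the parameterized one stores the markup unchanged.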