By default, MySQL result sets are retrieved completely from the server before any work can be done. With huge result sets this becomes unusable. I would like instead to retrieve the rows one by one from the server.
In Java, following the instructions here (under ‘ResultSet’), I create a statement like this:
    stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
                                java.sql.ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(Integer.MIN_VALUE);
This works nicely in Java. My question is: is there a way to do the same in python?
One thing I tried is to limit the query to 1000 rows at a time, like this:
    start_row = 0
    while True:
        cursor = conn.cursor()
        cursor.execute('SELECT item FROM items LIMIT %d, 1000' % start_row)
        rows = cursor.fetchall()
        if not rows:
            break
        start_row += 1000
        # Do something with rows...
However, this seems to get slower the higher start_row is.
And no, using fetchone() instead of fetchall() doesn’t change anything.
Clarification:
The naive code I use to reproduce this problem looks like this:
    import MySQLdb

    conn = MySQLdb.connect(user='user', passwd='password', db='mydb')
    cur = conn.cursor()
    print 'Executing query'
    cur.execute('SELECT * FROM bigtable')
    print 'Starting loop'
    row = cur.fetchone()
    while row is not None:
        print ', '.join([str(c) for c in row])
        row = cur.fetchone()
    cur.close()
    conn.close()
On a table of ~700,000 rows, this code runs quickly. But on a table of ~9,000,000 rows it prints 'Executing query' and then hangs for a very long time. That is why it makes no difference whether I use fetchone() or fetchall().
I think you have to connect passing cursorclass=MySQLdb.cursors.SSCursor. The default cursor fetches all the data at once, even if you don't use fetchall().

Edit: use SSCursor, or any other cursor class that supports server-side result sets (check the module docs on MySQLdb.cursors).
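A minimal sketch of the streaming approach, assuming the same credentials, database, and table names used in the question (the stream_rows helper is a made-up name for illustration):

```python
def stream_rows(query):
    # Import inside the function so the sketch can be read (and the
    # helper defined) without MySQLdb installed.
    import MySQLdb
    import MySQLdb.cursors

    # SSCursor asks the server for an unbuffered (server-side) result
    # set instead of materialising every row in client memory first.
    conn = MySQLdb.connect(user='user', passwd='password', db='mydb',
                           cursorclass=MySQLdb.cursors.SSCursor)
    cur = conn.cursor()
    try:
        cur.execute(query)
        # fetchone() now pulls rows from the server one at a time.
        row = cur.fetchone()
        while row is not None:
            yield row
            row = cur.fetchone()
    finally:
        # With SSCursor you must exhaust or close the cursor before
        # issuing another query on the same connection.
        cur.close()
        conn.close()
```

Usage would be along the lines of `for row in stream_rows('SELECT * FROM bigtable'): ...`, which should start producing rows immediately rather than hanging after "Executing query".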