I am somewhat new to transactional databases and have come across an issue I am trying to understand.
I have created a simple demonstration where a database connection is stored inside each of the 5 threads created by cherrypy. I have a method that displays a table of timestamps stored in the database and a button to add a new record of time stamps.
the table has 2 fields, one for the datetime.datetime.now() timestamp passed by python and one for the database timestamp set to default NOW().
CREATE TABLE test (given_time timestamp, default_time timestamp DEFAULT NOW());
I have 2 methods that interact with the database. The first will create a new cursor, insert a new given_timestamp, commit the cursor, and return to the index page. The second method will create a new cursor, select the 10 most recent timestamps and return those to the caller.
import sys import datetime import psycopg2 import cherrypy def connect(thread_index): # Create a connection and store it in the current thread cherrypy.thread_data.db = psycopg2.connect('dbname=timestamps') # Tell CherryPy to call 'connect' for each thread, when it starts up cherrypy.engine.subscribe('start_thread', connect) class Root: @cherrypy.expose def index(self): html = [] html.append('<html><body>') html.append('<table border=1><thead>') html.append('<tr><td>Given Time</td><td>Default Time</td></tr>') html.append('</thead><tbody>') for given, default in self.get_timestamps(): html.append('<tr><td>%s<td>%s' % (given, default)) html.append('</tbody>') html.append('</table>') html.append('<form action='add_timestamp' method='post'>') html.append('<input type='submit' value='Add Timestamp'/>') html.append('</form>') html.append('</body></html>') return '\n'.join(html) @cherrypy.expose def add_timestamp(self): c = cherrypy.thread_data.db.cursor() now = datetime.datetime.now() c.execute('insert into test (given_time) values ('%s')' % now) c.connection.commit() c.close() raise cherrypy.HTTPRedirect('/') def get_timestamps(self): c = cherrypy.thread_data.db.cursor() c.execute('select * from test order by given_time desc limit 10') records = c.fetchall() c.close() return records if __name__ == '__main__': cherrypy.config.update({'server.socket_host': '0.0.0.0', 'server.socket_port': 8081, 'server.thread_pool': 5, 'tools.log_headers.on': False, }) cherrypy.quickstart(Root())
I would expect the given_time and default_time timestamps to be only a few microseconds off from each other. However I am getting some strange behavior. If I add timestamps every few seconds, the default_time is not a few microseconds off from the given_time, but is usually a few microseconds off from the previous given_time.
Given Time Default Time 2009-03-18 09:31:30.725017 2009-03-18 09:31:25.218871 2009-03-18 09:31:25.198022 2009-03-18 09:31:17.642010 2009-03-18 09:31:17.622439 2009-03-18 09:31:08.266720 2009-03-18 09:31:08.246084 2009-03-18 09:31:01.970120 2009-03-18 09:31:01.950780 2009-03-18 09:30:53.571090 2009-03-18 09:30:53.550952 2009-03-18 09:30:47.260795 2009-03-18 09:30:47.239150 2009-03-18 09:30:41.177318 2009-03-18 09:30:41.151950 2009-03-18 09:30:36.005037 2009-03-18 09:30:35.983541 2009-03-18 09:30:31.666679 2009-03-18 09:30:31.649717 2009-03-18 09:30:28.319693
Yet, if I add a new timestamp about once a minute, both the given_time and default_time are only a few microseconds off as expected. However, after submitting the 6th timestamp (the number of threads + 1) the default_time is a few microseconds off from the first given_time timestamp.
Given Time Default Time 2009-03-18 09:38:15.906788 2009-03-18 09:33:58.839075 2009-03-18 09:37:19.520227 2009-03-18 09:37:19.520293 2009-03-18 09:36:04.744987 2009-03-18 09:36:04.745039 2009-03-18 09:35:05.958962 2009-03-18 09:35:05.959053 2009-03-18 09:34:10.961227 2009-03-18 09:34:10.961298 2009-03-18 09:33:58.822138 2009-03-18 09:33:55.423485
Even though I am explicitly closing the cursor after each use, it appears that the previous cursor is still being reused. How is that possible if I am closing the cursor after I’m done with it and creating a new cursor each time? Can someone please explain what is going on here?
Closer to an answer:
I’ve added a cursor.connection.commit() to the get_timestamps method and that now gives me accurate data with the timestamps. Can anyone explain why I could need to call cursor.connection.commit() when all I am doing is a select? I am guessing that every time I get a cursor, a transaction begins (or continues with an existing transaction unit it gets committed). Is there a better way to do this or am I stuck committing every time I get a cursor regardless of what I do with that cursor?
To address the question posed by your most recent edits:
In PostgreSQL,
NOW()is not the current time, but the time at the start of the current transaction. Psycopg2 is probably starting a transaction implicitly for you, and since the transaction is never closed (by a commit or otherwise), the timestamp gets ‘stuck’ and becomes stale.Possible fixes:
statement_timestamp()(not SQL-standard-compliant, but otherwise perfect for this scenario)From the manual, section 9.9.4, emphasis added: