I’ve been building an error logging app recently and was after a way of accurately timestamping the incoming data. When I say accurately I mean each timestamp should be accurate relative to each other (no need to sync to an atomic clock or anything like that).
I’ve been using datetime.now() as a first stab, but this isn’t perfect:
>>> for i in range(0,1000): ... datetime.datetime.now() ... datetime.datetime(2008, 10, 1, 13, 17, 27, 562000) datetime.datetime(2008, 10, 1, 13, 17, 27, 562000) datetime.datetime(2008, 10, 1, 13, 17, 27, 562000) datetime.datetime(2008, 10, 1, 13, 17, 27, 562000) datetime.datetime(2008, 10, 1, 13, 17, 27, 578000) datetime.datetime(2008, 10, 1, 13, 17, 27, 578000) datetime.datetime(2008, 10, 1, 13, 17, 27, 578000) datetime.datetime(2008, 10, 1, 13, 17, 27, 578000) datetime.datetime(2008, 10, 1, 13, 17, 27, 578000) datetime.datetime(2008, 10, 1, 13, 17, 27, 609000) datetime.datetime(2008, 10, 1, 13, 17, 27, 609000) datetime.datetime(2008, 10, 1, 13, 17, 27, 609000) etc.
The changes between clocks for the first second of samples looks like this:
uSecs difference 562000 578000 16000 609000 31000 625000 16000 640000 15000 656000 16000 687000 31000 703000 16000 718000 15000 750000 32000 765000 15000 781000 16000 796000 15000 828000 32000 843000 15000 859000 16000 890000 31000 906000 16000 921000 15000 937000 16000 968000 31000 984000 16000
So it looks like the timer data is only updated every ~15-32ms on my machine. The problem comes when we come to analyse the data because sorting by something other than the timestamp and then sorting by timestamp again can leave the data in the wrong order (chronologically). It would be nice to have the time stamps accurate to the point that any call to the time stamp generator gives a unique timestamp.
I had been considering some methods involving using a time.clock() call added to a starting datetime, but would appreciate a solution that would work accurately across threads on the same machine. Any suggestions would be very gratefully received.
You’re unlikely to get sufficiently fine-grained control that you can completely eliminate the possibility of duplicate timestamps – you’d need resolution smaller than the time it takes to generate a datetime object. There are a couple of other approaches you might take to deal with it:
Deal with it. Leave your timestamps non-unique as they are, but rely on python’s sort being stable to deal with reordering problems. Sorting on timestamp first, then something else will retain the timestamp ordering – you just have to be careful to always start from the timestamp ordered list every time, rather than doing multiple sorts on the same list.
Append your own value to enforce uniqueness. Eg. include an incrementing integer value as part of the key, or append such a value only if timestamps are different. Eg.
The following will guarantee unique timestamp values:
For multiple processes (rather than threads), it gets a bit trickier though.