I have time series data that I am currently storing in a dictionary where the dictionary ‘keys’ are datetime.datetime objects. Something along the lines of:
data[datetime.datetime(2012,5,14,15,28,2)]={'error':error,'flags':flags,'value':value}
The question I have is: What is the best way to find the two closest times (one before and one after) to a specified time? I need this function to be as fast as possible because it is called ~10,000 times inside a loop that linearly interpolates between the two closest points.
I currently have one method working which takes a ridiculously long time because it searches through all the keys (~50,000):
def findTime(time):
    keys=data.keys()
    bdt=10000000000000000000
    adt=10000000000000000000
    minKey=False
    maxKey=False
    for key in keys:
        dt=(time-key).total_seconds()
        if abs(dt)<bdt and dt>0:
            bdt=abs(dt)
            minKey=key
        elif abs(dt)<adt and dt<0:
            adt=abs(dt)
            maxKey=key
    return minKey,maxKey
My attempt at using bisect:
def findTime(time):
    keys=data.keys()
    l,r = bisect.bisect_left(time,keys), bisect.bisect_right(time,keys)
    return l,r
Unfortunately, this produces an error:
TypeError: 'datetime.datetime' object does not support indexing
Any help would be appreciated.
The bisect functions take a sorted array (or list, or really, anything that can be indexed) as their first argument and the value to search for as their second. keys is an unsorted array, and you're passing it as the second argument. Sorting the keys and passing them first, with time second, should work, although you should keep the sorted copy around for repeated searches that have not altered the data, rather than re-sorting every time.
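A minimal sketch of that fix, under a few assumptions: the sample data values, the module-level sorted_keys list, and the find_neighbors name are all made up for illustration and are not from the original post. Since bisect returns indices rather than keys, the sketch converts them into the neighboring keys, and it returns None (rather than False, as in the question's loop) when there is no key on one side.

```python
import bisect
import datetime

# Hypothetical sample data in the same shape as the question's dictionary.
data = {
    datetime.datetime(2012, 5, 14, 15, 28, 2): {'error': 0.1, 'flags': 0, 'value': 1.0},
    datetime.datetime(2012, 5, 14, 15, 29, 2): {'error': 0.1, 'flags': 0, 'value': 2.0},
    datetime.datetime(2012, 5, 14, 15, 30, 2): {'error': 0.1, 'flags': 0, 'value': 4.0},
}

# Sort the keys once and reuse the list; rebuild it only when data changes.
sorted_keys = sorted(data)

def find_neighbors(time, keys=sorted_keys):
    """Return (before, after): the closest keys strictly before and strictly
    after `time`, or None on a side where no such key exists."""
    left = bisect.bisect_left(keys, time)    # index of the first key >= time
    right = bisect.bisect_right(keys, time)  # index of the first key > time
    before = keys[left - 1] if left > 0 else None
    after = keys[right] if right < len(keys) else None
    return before, after
```

Each lookup is then a binary search over the sorted list (O(log n)) instead of a scan over all ~50,000 keys, and the one-time sort is paid only once as long as the dictionary is not modified between searches.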