I have a Python datetime timestamp and a large dict (index) where keys are

Question

Asked: May 26, 2026

I have a Python datetime timestamp and a large dict (index) where keys are timestamps and the values are some other information I’m interested in.

I need to find the datetime (the key) in index that is closest to timestamp, as efficiently as possible.

At the moment I’m doing something like:

for timestamp in timestamps:
    # Linear scan over every key in index: O(N) work per lookup
    closestTimestamp = min(index, key=lambda dt: abs(timestamp - dt))

This works, but takes too long: my index dict has millions of entries, and I’m doing the search thousands of times. I’m flexible about data structures. The timestamps I look up are roughly sequential, so I iterate from the earliest to the latest, and the timestamps in the text file that I load into the dict are sequential as well.

Any ideas for optimisation would be greatly appreciated.


1 Answer

Editorial Team · Answer 1 · 2026-05-26T20:32:17+00:00

Dictionaries aren’t organized for efficient near-miss searches. They are designed for exact matches (using a hash table).

You may be better off maintaining a separate, fast-searchable ordered structure.

A simple way to start is to keep a sorted list of the keys and use the bisect module, which gives fast O(log N) searches but slower O(N) insertions:

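from bisect import bisect_left

def nearest(ts):
    # Given a presorted list of timestamps:  s = sorted(index)
    i = bisect_left(s, ts)
    # Only the neighbours of the insertion point can be the closest match
    return min(s[max(0, i-1): i+2], key=lambda t: abs(ts - t))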
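
To make the lookup above concrete, here is a minimal self-contained sketch. It assumes the keys are sorted once up front and that the dict does not change afterwards; the sample data and the names sorted_keys, start and query are invented for illustration.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

def nearest(sorted_keys, ts):
    """Return the element of sorted_keys closest to ts (ties go to the earlier one)."""
    if not sorted_keys:
        raise ValueError("sorted_keys is empty")
    i = bisect_left(sorted_keys, ts)
    # The closest key is either just before or at the insertion point.
    candidates = sorted_keys[max(0, i - 1): i + 1]
    return min(candidates, key=lambda t: abs(ts - t))

# Illustrative data: one entry every 10 seconds.
start = datetime(2026, 5, 26, 12, 0, 0)
index = {start + timedelta(seconds=10 * k): f"record {k}" for k in range(6)}

sorted_keys = sorted(index)              # sort once, reuse for every lookup
query = start + timedelta(seconds=27)
closest = nearest(sorted_keys, query)    # the key at +30 s
value = index[closest]                   # exact-match dict lookup for the payload
```

Each lookup costs O(log N) instead of a full scan, and the dict is still used for the final exact-match access.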

A more sophisticated approach, suitable for dicts that are updated dynamically, is to use blist, which employs a tree structure for fast O(log N) insertions and lookups. You only need this if the dict is going to change over time.
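
If updates are only occasional, the standard library may be enough before reaching for a tree-based container. The sketch below keeps a sorted key list in sync with the dict using bisect.insort, which is the O(N) insertion cost referred to earlier. The helper name add_entry and the sample data are assumptions made for illustration.

```python
from bisect import insort
from datetime import datetime, timedelta

def add_entry(index, sorted_keys, ts, value):
    """Insert into the dict and keep the parallel sorted key list in sync."""
    if ts not in index:
        # O(log N) to find the slot, O(N) to shift the list elements.
        insort(sorted_keys, ts)
    index[ts] = value

start = datetime(2026, 5, 26, 12, 0, 0)
index = {}
sorted_keys = []
# Out-of-order and duplicate arrivals are both handled.
for offset in (30, 10, 20, 10):
    add_entry(index, sorted_keys, start + timedelta(seconds=offset), f"record at +{offset}s")
```

With millions of keys and frequent inserts the list shifting can start to dominate, which is the case where a tree-based structure becomes worthwhile.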

If you want to stay with a dictionary-based approach, consider a dict-of-lists that clusters entries with nearby timestamps:

def get_closest_stamp(ts):
    'Speed-up timestamp search by looking only at entries in the same hour'
    hour = round_to_nearest_hour(ts)
    cluster = daydict[hour]         # list of timestamps clustered under that hour
    return min(cluster, key=lambda t: abs(ts - t))

Note: for exact results near cluster boundaries, store timestamps that fall close to a boundary in both the primary cluster and the adjacent cluster.
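
A rough sketch of the clustering idea follows. It is a variant rather than the exact scheme described: instead of duplicating boundary entries, it buckets each timestamp by the start of its hour and searches the neighbouring hours at lookup time. The helpers floor_to_hour and build_clusters are hypothetical names, and the result is only guaranteed to be the true nearest when the query's own hour bucket is non-empty.

```python
from collections import defaultdict
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)

def floor_to_hour(ts):
    """Hypothetical helper: truncate a datetime to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)

def build_clusters(timestamps):
    """Group timestamps into a dict keyed by the start of their hour."""
    clusters = defaultdict(list)
    for t in timestamps:
        clusters[floor_to_hour(t)].append(t)
    return clusters

def get_closest_stamp(clusters, ts):
    """Search the query's hour bucket plus both neighbouring buckets.

    Returns None if all three buckets are empty.
    """
    hour = floor_to_hour(ts)
    candidates = [
        t
        for h in (hour - HOUR, hour, hour + HOUR)
        for t in clusters.get(h, ())   # .get avoids creating empty buckets
    ]
    return min(candidates, key=lambda t: abs(ts - t)) if candidates else None

# Illustrative data: the nearest entry sits just across an hour boundary.
stamps = [
    datetime(2026, 5, 26, 10, 5),
    datetime(2026, 5, 26, 11, 0, 10),
    datetime(2026, 5, 26, 12, 30),
]
clusters = build_clusters(stamps)
closest = get_closest_stamp(clusters, datetime(2026, 5, 26, 10, 59, 59))
```

Searching neighbouring buckets trades a slightly larger candidate list for not having to pick a boundary margin when building the clusters.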
