Say that you have a sorted list of n timestamps (Python datetime objects). How would you yield a list of tuples of the form (t, count), where t is a datetime object and count is the number of elements in the list at most x minutes after t?
For example, given the dates (strings, for brevity; in reality datetime objects):
    timestamps = ["13:00", "13:01", "13:03", "13:04", "13:05", "13:06", "13:09"]
if x is two minutes, then yield
    [("13:00", 2), ("13:03", 3), ("13:06", 1), ("13:09", 1)]
What I’m trying to do is make a coarser list of hits on a resource, and the only data I have is the access time of every hit (the granularity is to the millisecond, and I’d like it granular to the minute, or ten minutes).
I would post my attempts, but I’m ashamed…
Edit: This is what I have so far… testing to see if it works…
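    from datetime import timedelta

    def group_timestamps(timestamps, chunksize=10):
        """Groups a sorted list of timestamps in chunks of ``chunksize`` minutes"""
        cs = timedelta(minutes=chunksize)
        if not timestamps:
            return []
        t0 = timestamps[0]
        count = 0  # start at 0: the loop below also visits timestamps[0]
        chunks = []
        for ts in timestamps:
            if (ts - t0) <= cs:
                # still within ``chunksize`` minutes of the chunk start
                count += 1
            else:
                # close the current chunk and start a new one at ``ts``
                chunks.append((t0, count))
                t0 = ts
                count = 1
        chunks.append((t0, count))  # flush the last open chunk
        return chunks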
This should work:
Following your example:
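One way this could look, as a minimal sketch rather than a definitive implementation. It assumes the list is already sorted, that each group starts at the first timestamp not covered by the previous group, and that "at most x minutes" is inclusive. The name `group_by_window` and the placeholder date are hypothetical, chosen only for illustration:

```python
from datetime import datetime, timedelta


def group_by_window(timestamps, minutes=2):
    """Return [(start, count), ...] for a sorted list of datetimes.

    Each group starts at the first timestamp not covered by the previous
    group and counts every timestamp at most ``minutes`` after that start.
    """
    window = timedelta(minutes=minutes)
    groups = []
    start = None
    count = 0
    for ts in timestamps:
        if start is not None and ts - start <= window:
            # ts still falls inside the current window
            count += 1
        else:
            # close the previous group (if any) and open a new one at ts
            if start is not None:
                groups.append((start, count))
            start, count = ts, 1
    if start is not None:
        groups.append((start, count))  # flush the last group
    return groups


# Assumption: the times from the question, placed on an arbitrary date.
times = ["13:00", "13:01", "13:03", "13:04", "13:05", "13:06", "13:09"]
timestamps = [datetime.strptime("2000-01-01 " + t, "%Y-%m-%d %H:%M") for t in times]

result = [(start.strftime("%H:%M"), n) for start, n in group_by_window(timestamps)]
print(result)
# [('13:00', 2), ('13:03', 3), ('13:06', 1), ('13:09', 1)]
```

Because the windows are anchored at the first hit of each group rather than at fixed clock boundaries, the group starts depend on the data. If fixed buckets (every full minute or every ten minutes) are what you need, flooring each timestamp to the bucket boundary and counting would be a different, equally simple approach.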