I have a large data loaded from a pickled file. The data is a

Question

0

Asked: May 13, 20262026-05-13T21:37:39+00:00 2026-05-13T21:37:39+00:00

I have a large data loaded from a pickled file. The data is a

0

I have a large data loaded from a pickled file. The data is a sorted list of tuples containing a datetime and an int like this

[ (datetime.datetime(2010, 2, 26, 12, 8, 17), 5594813L), 
  (datetime.datetime(2010, 2, 26, 12, 7, 31), 5594810L), 
  (datetime.datetime(2010, 2, 26, 12, 6, 4) , 5594807L),
  etc
]

I want to get a population density based on some time intervals. For example, I want to grab the number of records within 5 minute / 1 minute / 30 second periods.

What is the best method to do this? I know I can just loop through every instance in the list but was looking for a better approach (if one exists).

Desired output would be something like:

2010-01-01 04:10:00  --- 5000
2010-02-04 10:05:00  --- 4000
2010-01-02 13:25:00  --- 3999

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-13T21:37:39+00:00

bisect.bisect is another way to solve this problem:

import datetime
import bisect
import collections

data=[ (datetime.datetime(2010, 2, 26, 12, 8, 17), 5594813L), 
  (datetime.datetime(2010, 2, 26, 12, 7, 31), 5594810L), 
  (datetime.datetime(2010, 2, 26, 12, 6, 4) , 5594807L),
]
interval=datetime.timedelta(minutes=1,seconds=30)
start=datetime.datetime(2010, 2, 26, 12, 6, 4)
grid=[start+n*interval for n in range(10)]
bins=collections.defaultdict(list)
for date,num in data:
    idx=bisect.bisect(grid,date)
    bins[idx].append(num)
for idx,nums in bins.iteritems():
    print('{0} --- {1}'.format(grid[idx],len(nums)))

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have a large data loaded from a pickled file. The data is a

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply