I have data in a three-level nested dictionary, laid out as:
movie_id, date, customer_id, views
0, (2011,12,22), 0, 22
0, (2011,12,22), 1, 2
0, (2011,12,22), 2, 12
.....
0, (2011,12,22), 7, 2
0, (2011,12,23), 0, 123
..
So the data records how many times each movie was watched on each day by each customer (there are only 8 customers).
Now I want to calculate, on average, how many times each movie has been watched by each customer.
So the result would look like:
movie_id, customer_id, avg_views
0, 0, 33.2
0, 1, 22.3
and so on.
What is the Pythonic way to solve this?
Thanks
Edit:
import datetime
from collections import defaultdict
data = defaultdict(lambda: defaultdict(dict))
date = datetime.datetime(2011, 1, 22)
data[0][date][0] = 22
print data
defaultdict(<function <lambda> at 0x00000000022F7CF8>,
{0: defaultdict(<type 'dict'>,
{datetime.datetime(2011, 1, 22, 0, 0): {0: 22}}))
Suppose there are just 2 customers, 1 movie, and 2 days' worth of data:
movie_id, date, customer_id, views
0, (2011,1,22), 0, 22
0, (2011,1,22), 1, 23
0, (2011,1,23), 0, 44
Note: customer 1 didn't watch movie 0 on 23 January.
Now the answer would be:
movie_id, customer_id, avg_views
0, 0, (22+44)/2
0, 1, (23)/1
The built-in sum makes this easy. In my original version I used dict.keys() a lot, but iterating over a dictionary gives you the keys by default. This function calculates a single line of the result:
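A minimal sketch of such a function, assuming the nested layout data[movie_id][date][customer_id] from the question's edit. The name average_views and the choice to count only days on which the customer has an entry are assumptions made for illustration:

```python
import datetime
from collections import defaultdict


def average_views(data, movie_id, customer_id):
    """Mean views per day, counting only days on which the customer has an entry."""
    # Iterating over data[movie_id] yields its date keys directly.
    counts = [data[movie_id][date][customer_id]
              for date in data[movie_id]
              if customer_id in data[movie_id][date]]
    # float() avoids integer division on Python 2.
    return sum(counts) / float(len(counts)) if counts else None


# Sample data from the question's edit: 1 movie, 2 customers, 2 days.
data = defaultdict(lambda: defaultdict(dict))
jan22 = datetime.datetime(2011, 1, 22)
jan23 = datetime.datetime(2011, 1, 23)
data[0][jan22][0] = 22
data[0][jan22][1] = 23
data[0][jan23][0] = 44

print(average_views(data, 0, 0))  # 33.0
print(average_views(data, 0, 1))  # 23.0
```

Customer 1 has no entry for 23 January, so that day is left out of the divisor, which matches the (23)/1 in the question. Returning None when a customer never watched the movie is just one way to handle the empty case.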
Then you can just loop it to get whatever form you want. Maybe:
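One possible loop, as a hedged sketch rather than the only way to do it. It assumes the same data[movie_id][date][customer_id] layout as the question; the names average_views and all_averages are hypothetical, and a small helper is included so the snippet runs on its own:

```python
import datetime
from collections import defaultdict


def average_views(data, movie_id, customer_id):
    # Hypothetical helper: mean over the days on which the customer has an entry.
    counts = [views[customer_id] for views in data[movie_id].values()
              if customer_id in views]
    return sum(counts) / float(len(counts)) if counts else None


def all_averages(data):
    """Return (movie_id, customer_id, avg_views) tuples for every pair seen in data."""
    rows = []
    for movie_id in sorted(data):
        # Collect every customer that appears on any date for this movie.
        customers = set()
        for date in data[movie_id]:
            customers.update(data[movie_id][date])
        for customer_id in sorted(customers):
            rows.append((movie_id, customer_id,
                         average_views(data, movie_id, customer_id)))
    return rows


# Sample data from the question's edit: 1 movie, 2 customers, 2 days.
data = defaultdict(lambda: defaultdict(dict))
data[0][datetime.datetime(2011, 1, 22)][0] = 22
data[0][datetime.datetime(2011, 1, 22)][1] = 23
data[0][datetime.datetime(2011, 1, 23)][0] = 44

print("movie_id, customer_id, avg_views")
for row in all_averages(data):
    print("%s, %s, %.1f" % row)
```

Building a list of tuples keeps the calculation separate from the printing, so the same rows could just as easily be written to a file or stored in another dictionary.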