I have two tables points and contacts and I’m trying to get the average points.score per contact grouped on a monthly basis. Note that points and contacts aren’t related, I just want the sum of points created in a month divided by the number of contacts that existed in that month.
So, I need to sum points grouped by the created_at month, and I need to take the count of contacts FOR THAT MONTH ONLY. It’s that last part that’s tricking me up. I’m not sure how I can use a column from an outer query in the subquery. I tried something like this:
SELECT SUM(score) AS points_sum,
EXTRACT(month FROM created_at) AS month,
date_trunc('MONTH', created_at) + INTERVAL '1 month' AS next_month,
(SELECT COUNT(id) FROM contacts WHERE contacts.created_at <= next_month) as contact_count
FROM points
GROUP BY month, next_month
ORDER BY month
So, I’m extracting the actual month that my points are being summed, and at the same time, getting the beginning of the next_month so that I can say “Get me the count of contacts where their created at is < next_month”
But it complains that column next_month doesn't exist This is understandable as the subquery knows nothing about the outer query. Qualifying with points.next_month doesn’t work either.
So can someone point me in the right direction of how to achieve this?
Tables:
Points
score | created_at
10 | "2011-11-15 21:44:00.363423"
11 | "2011-10-15 21:44:00.69667"
12 | "2011-09-15 21:44:00.773289"
13 | "2011-08-15 21:44:00.848838"
14 | "2011-07-15 21:44:00.924152"
Contacts
id | created_at
6 | "2011-07-15 21:43:17.534777"
5 | "2011-08-15 21:43:17.520828"
4 | "2011-09-15 21:43:17.506452"
3 | "2011-10-15 21:43:17.491848"
1 | "2011-11-15 21:42:54.759225"
sum, month and next_month (without the subselect)
sum | month | next_month
14 | 7 | "2011-08-01 00:00:00"
13 | 8 | "2011-09-01 00:00:00"
12 | 9 | "2011-10-01 00:00:00"
11 | 10 | "2011-11-01 00:00:00"
10 | 11 | "2011-12-01 00:00:00"
Edit
Now with running sum of contacts. My first draft used new contacts per month, which is obviously not what OP wants.
points. If you want all months, including missing ones inpoints, generate a list of months with generate_series() and LEFT JOIN to it.Your original form of the query could work like this:
The fix for the immediate cause of your error is to put the aggregate into a subquery. You were mixing levels in a way that is impossible.
I expect my variant to be slightly faster with big tables. Not sure about smaller tables. Would be great if you’d report back with test results.
Plus a minor fix:
<instead of<=.