I need to collect some statistical information in my application.
I have a table of users (tb_user).
Every time a new user accesses the application, a new record is added to this table, i.e. one row per user. The main fields are id and date_time (timestamp of the first time the user accessed the application).
tb_user
id (bigint) | date_time (timestamp with time zone)
1 | 2012-01-29 11:29:50.359-03
2 | 2012-01-31 14:27:10.359-03
I need to get:
the average number of users per day, week and month
Example:
by day: 55.45
by week: XX.XX
by month: XX.XX
EDIT:
My best solution was:
WITH daily_count AS (SELECT COUNT(id) AS user_count FROM tb_user)
SELECT user_count, tbaux2.days, (user_count/tbaux2.days)
FROM daily_count,
     (SELECT EXTRACT(DAY FROM (t2.diff)) + 1 AS days
      FROM (WITH tbaux AS (SELECT min(date_time) AS min FROM tb_user)
            SELECT (now() - min) AS diff
            FROM tbaux) AS t2) AS tbaux2
GROUP BY user_count, tbaux2.days
But this solution only worked with EXTRACT (DAY … It did not work with weeks and months.
Any help is welcome.
Alternatively:
SELECT user_count,
       tbaux2.days,
       (user_count/tbaux2.days) AS userPerDay,
       ((user_count/tbaux2.days) * 7) AS userPerWeek,
       ((user_count/tbaux2.days) * 30) AS userPerMonth
-- FROM and GROUP BY clauses as in the query above
EDIT 2:
Based on responses from @Bruno, there are some considerations:
When I asked the question, what I really wanted was a way to select data by day, month and year. I believe the query that I posted, and that @Bruno refined, should be interpreted as an average “per day, per 7 days and per 30 days” rather than per day, week and month. Interpreted this way, I believe the kind of problem quoted in the example (the 10% drop) does not arise. This “every N days” approach is the answer I need at the moment, so I will accept this answer.
I suggest the following improvements to the post:
- Consider only complete days in the result (do not count users from the current day, and do not include the current day in the divisor).
- Round the result to two decimal places.
- Add a new query that works on real calendar weeks and calendar months.
Thanks.
You should look into aggregate functions (`min`, `max`, `count`, `avg`), which go hand in hand with `GROUP BY`. For date-based aggregations, `date_trunc` is also useful. For example, this will return the number of rows per day:
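
A minimal sketch of such a per-day count, assuming the `tb_user` table and `date_time` column from the question (the aliases `day` and `user_count` are only illustrative):

```sql
-- Number of new users per calendar day.
-- date_trunc('day', ...) zeroes out the time part, so all rows
-- from the same day fall into the same group.
SELECT date_trunc('day', date_time) AS day,
       COUNT(id)                    AS user_count
FROM tb_user
GROUP BY date_trunc('day', date_time)
ORDER BY day;
```

Since `date_time` is a `timestamp with time zone`, the day boundaries follow the session's time zone setting.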
You can then do the daily average using something like this (with a CTE):
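
Something along these lines, as a sketch (the CTE name `daily_count` is borrowed from the question; the column aliases are assumptions):

```sql
-- Step 1 (CTE): one row per day with the number of new users.
-- Step 2: average those per-day counts.
WITH daily_count AS (
    SELECT date_trunc('day', date_time) AS day,
           COUNT(id)                    AS user_count
    FROM tb_user
    GROUP BY date_trunc('day', date_time)
)
SELECT AVG(user_count) AS avg_users_per_day
FROM daily_count;
```

Note that days without any new user produce no row in the CTE, so this is the average over days that had at least one new user, not over every calendar day.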
Use `'week'` instead of `'day'` for the weekly counts, and so on (see the `date_trunc` documentation).
EDIT: (Following comment: average up to and including 5/1/2012, i.e. before the 6th.)
What’s above is over-complicated, in this case. This should give you the same result:
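
One way to write this without the CTE, as a sketch. It assumes that 5/1/2012 means 5 January 2012, so the cutoff literal `'2012-01-06'` is an assumption, as is the alias:

```sql
-- Total rows before the cutoff divided by the number of distinct days
-- that have rows. This equals the average of the per-day counts.
-- NULLIF avoids a division by zero when no row matches the filter.
SELECT COUNT(id)::numeric
       / NULLIF(COUNT(DISTINCT date_trunc('day', date_time)), 0)
       AS avg_users_per_day
FROM tb_user
WHERE date_time < '2012-01-06';
```

The cast to `numeric` keeps the division from being truncated to an integer.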
EDIT 2: After your edit, I guess what you’re after is just a single global average for the entire period of existence of your database, rather than groups by month/week/day.
This should give you the average number of rows per day:
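
A sketch of such a global average, reusing the `EXTRACT(DAY FROM ...) + 1` day count from the question. The names `span`, `user_count` and `first_date_time` are illustrative assumptions; `last_date_time` is the last recorded visit:

```sql
-- Total number of users divided by the number of days between
-- the first and the last recorded visit (partial day counted as one).
WITH span AS (
    SELECT COUNT(id)      AS user_count,
           MIN(date_time) AS first_date_time,
           MAX(date_time) AS last_date_time
    FROM tb_user
)
SELECT user_count::numeric
       / (EXTRACT(DAY FROM (last_date_time - first_date_time)) + 1)::numeric
       AS avg_users_per_day
FROM span;
```

Subtracting two timestamps gives an interval expressed in days and hours, so `EXTRACT(DAY FROM ...)` returns the number of whole days in the span.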
(I would replace `last_date_time` with `NOW()` to make the average over the time until now, rather than until the last visit, if there’s no recent visit.)
Then, for daily, weekly, and “monthly”:
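
As a sketch, the three figures can be derived from the same per-day average. The CTE and column names are assumptions, and the rounding to two decimals follows the suggestion in the question:

```sql
WITH span AS (
    SELECT COUNT(id)      AS user_count,
           MIN(date_time) AS first_date_time,
           MAX(date_time) AS last_date_time  -- or NOW(), see the remark above
    FROM tb_user
),
per_day AS (
    -- Average number of new users per day over the whole period.
    SELECT user_count::numeric
           / (EXTRACT(DAY FROM (last_date_time - first_date_time)) + 1)::numeric
           AS avg_per_day
    FROM span
)
SELECT ROUND(avg_per_day, 2)      AS users_per_day,
       ROUND(avg_per_day * 7, 2)  AS users_per_7_days,
       ROUND(avg_per_day * 30, 2) AS users_per_30_days
FROM per_day;
```

The weekly and "monthly" values here are just the daily average scaled by 7 and 30, not counts per calendar week or calendar month.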
This being said, conclusions you draw from such statistics might not be great, especially if you want to see how it changes.
I would also normalise the data per day rather than assuming 30 days in a month (if not per hour, because not all days have 24 hours). Say you have 10 visits per day in Jan 2011 and 10 visits per day in Feb 2011. That gives you 310 visits in Jan and 280 visits in Feb. If you don’t pay attention, you could think you’ve had almost a 10% drop in the number of visitors, so something went wrong in Feb, when really, this isn’t the case.
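
A possible way to do this per-day normalisation for calendar months, as a sketch (all aliases are assumptions): divide each month's count by the actual number of days in that month.

```sql
-- Per calendar month: raw count and count normalised per day.
-- The last day of the month is obtained as
-- (first day of month + 1 month - 1 day); its day number is the month length.
SELECT date_trunc('month', date_time) AS month,
       COUNT(id)                      AS user_count,
       COUNT(id)::numeric
       / EXTRACT(DAY FROM (date_trunc('month', date_time)
                           + INTERVAL '1 month'
                           - INTERVAL '1 day'))::numeric
       AS users_per_day
FROM tb_user
GROUP BY date_trunc('month', date_time)
ORDER BY month;
```

With the figures from the example, January gives 310 / 31 and February 280 / 28, i.e. 10 per day in both months. The current, still incomplete month is divided by its full length here, so its value will be too low until the month ends.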