I have a table named visiting that looks like this:
id | visitor_id | visit_time
-------------------------------------
1  | 1          | 2009-01-06 08:45:02
2  | 1          | 2009-01-06 08:58:11
3  | 1          | 2009-01-06 09:08:23
4  | 1          | 2009-01-06 21:55:23
5  | 1          | 2009-01-06 22:03:35
I want to write a SQL query that counts how many times a user visits within one session (a session being a run of successive visits less than 1 hour apart).
So, for the example data, I want to get the following result:
visitor_id | count
-------------------
1          | 3
1          | 2
BTW, I use PostgreSQL 8.3. Thanks!
UPDATE: updated the timestamps in the example data table. sorry for the confusion.
UPDATE: I don’t care much whether the solution is a single SQL query or uses a stored procedure, subquery, etc. I only care how to get it done 🙂
The question is slightly ambiguous because you’re assuming, or requiring, that each one-hour window starts at a set point. A natural query would also return a result record of (1,2) for the visits in the hour from 08:58 to 09:58. You would have to ‘tell’ your query that, for some determinable reason, the windows start at visits 1 and 4, or you’d get the natural result set:
That extra logic is going to be expensive, and it’s too complicated for my fragile mind this morning; somebody better than me at Postgres can probably solve this.
I would normally want to solve this by having a sessionkey column in the table that I could cheaply group by, for performance reasons, but I think there’s also a logical problem. Deriving session info from timings seems dangerous to me because I don’t believe the user will definitely be logged out after an hour’s activity. Most session systems work by expiring the session after a period of inactivity, i.e. it’s very likely that a visit after 9:45 is going to be in the same session, because your hourly period is going to be reset at 9:08.
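For what it’s worth, the inactivity-based reading (a new session starts whenever the previous visit by the same visitor is an hour or more earlier) can probably be written without window functions, which PostgreSQL 8.3 doesn’t have. The following is only a rough sketch against the visiting table from the question; the aliases v, st, p and the output columns session_start and visit_count are names I made up for illustration, and I haven’t tuned it for performance.

```sql
-- Sketch for PostgreSQL 8.3 (window functions only arrived in 8.4).
-- Assumption: a visit starts a session when the same visitor has no
-- earlier visit less than 1 hour before it.
SELECT s.visitor_id,
       s.session_start,
       COUNT(*) AS visit_count
FROM (
    SELECT v.id,
           v.visitor_id,
           -- Latest session-starting visit at or before this visit.
           (SELECT MAX(st.visit_time)
              FROM visiting st
             WHERE st.visitor_id = v.visitor_id
               AND st.visit_time <= v.visit_time
               -- st starts a session: nothing in the hour before it.
               AND NOT EXISTS (
                       SELECT 1
                         FROM visiting p
                        WHERE p.visitor_id = st.visitor_id
                          AND p.visit_time <  st.visit_time
                          AND p.visit_time >  st.visit_time - INTERVAL '1 hour'
                   )
           ) AS session_start
      FROM visiting v
) AS s
GROUP BY s.visitor_id, s.session_start
ORDER BY s.visitor_id, s.session_start;
```

On the five sample rows, the session starts would be 08:45:02 and 21:55:23, so this should give counts of 3 and 2 for visitor 1. The correlated subqueries make it roughly quadratic per visitor, so an index on (visitor_id, visit_time) is likely worth having; a stored sessionkey column would still be the cheaper option if you can change the schema.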