I have a huge sql table (more than 1 billion) of user transactions. I’d

Question

Asked: June 15, 2026 (2026-06-15T05:19:26+00:00)

I have a huge SQL table (more than 1 billion rows) of user transactions.
I’d like to add a binary column that indicates whether or not the current row is within 40 minutes of the previous row for the same user_id.

For instance:

user_id | date                
--------+--------------------
1       | 2011-01-01 12:15:00
1       | 2011-01-01 12:00:00
8       | 2011-01-01 15:00:00
8       | 2011-01-01 14:00:00

The result of the query would be:

user_id | date                | new
--------+---------------------+----
1       | 2011-01-01 12:15:00 | 0
1       | 2011-01-01 12:00:00 | 1
8       | 2011-01-01 15:00:00 | 1
8       | 2011-01-01 14:00:00 | 1

I’d like to avoid joining the entire table to itself,
and maybe use a side table or an analytic function (OVER with PARTITION BY) instead.

1 Answer

Editorial Team · Answer 1 · 2026-06-15T05:19:27+00:00

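select user_id,
       date,
       case
          -- first row per user: lag() returns null, so flag it 1 as in the expected output
          when lag(date) over w is null then 1
          -- gap to the previous row of the same user exceeds 40 minutes
          when date - lag(date) over w > interval '40' minute then 1
          else 0
       end as diff_flag
from the_table
window w as (partition by user_id order by date)
order by user_id, date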

It assumes that date is a timestamp column despite its name.
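The lag() logic can be sanity-checked against the four sample rows from the question without a PostgreSQL server. The following is only a sketch under several assumptions: it runs the query in an in-memory SQLite database (version 3.25 or later, for window functions) through Python's sqlite3 module, it replaces the interval arithmetic with a comparison of epoch seconds because SQLite has no interval type, and it flags the first row of each user as 1 to match the expected output in the question. The names ts, ROWS, QUERY and flag_rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# Sample rows from the question. The timestamp column is called ts here
# (a hypothetical name) instead of date.
ROWS = [
    (1, "2011-01-01 12:15:00"),
    (1, "2011-01-01 12:00:00"),
    (8, "2011-01-01 15:00:00"),
    (8, "2011-01-01 14:00:00"),
]

# SQLite stand-in for the PostgreSQL query: the gap is computed in epoch
# seconds because SQLite has no interval type.
QUERY = """
select user_id,
       ts,
       case
          when lag(ts) over w is null then 1
          when cast(strftime('%s', ts) as integer)
               - cast(strftime('%s', lag(ts) over w) as integer) > 40 * 60 then 1
          else 0
       end as diff_flag
from the_table
window w as (partition by user_id order by ts)
order by user_id, ts desc
"""


def flag_rows(rows):
    """Load rows into an in-memory table and return (user_id, ts, diff_flag)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table the_table (user_id integer, ts text)")
    conn.executemany("insert into the_table values (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in flag_rows(ROWS):
        print(row)
```

The outer order by sorts newest first only so that the rows come back in the same order as the table in the question; the window itself still orders each user's rows oldest first, which is what lag() needs.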

It’s the only way I can see. An index on (user_id, date) might speed things up, especially on PostgreSQL 9.2, where this could qualify for an index-only scan. Either way, this is going to scan the whole table (or maybe only the index on 9.2).
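The idea of reading only the index can be illustrated with a small experiment. This is a rough sketch, not a PostgreSQL benchmark: it uses SQLite through Python's sqlite3 module, where the closest analogue of an index-only scan is reported as a "covering index" in the query plan. The names ts, the_table_user_ts and plan_for_ordered_scan are hypothetical. On PostgreSQL the equivalent would be to create the index on (user_id, date) and inspect the plan with explain.

```python
import sqlite3


def plan_for_ordered_scan():
    """Return SQLite's plan steps for reading (user_id, ts) in index order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table the_table (user_id integer, ts text)")
    # Composite index matching the window's partition and order columns.
    conn.execute("create index the_table_user_ts on the_table (user_id, ts)")
    rows = conn.execute(
        "explain query plan "
        "select user_id, ts from the_table order by user_id, ts"
    ).fetchall()
    conn.close()
    # The human-readable description is the last column of each plan row.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    for step in plan_for_ordered_scan():
        print(step)
```

Because the index holds both columns the query needs, already sorted by user and time, the plan can avoid both a separate sort and a lookup into the table itself.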

Btw: it’s not a good idea to name a column after a reserved word (date). Additionally, date is a very poor name from a documentation point of view.
