I have a data table (will be millions of records but I will make it simple here) that looks like this.
ID  APPROVAL_DT  DAY_DT      TRANS_COUNT  SALE_AMOUNT
1   2010-04-22   2010-04-27  2            260
1   2010-04-22   2010-04-28  1            40
2   2010-03-28   2010-04-02  1            5
2   2010-03-28   2010-04-03  5            10
2   2010-03-28   2010-04-04  1            20
3   2010-04-25   2010-05-01  6            10
3   2010-04-25   2010-05-02  4            10
4   2010-06-01   2010-06-07  1            5
I need to figure out the DAY_DT for each ID where either the sum of TRANS_COUNT over all previous and current DAY_DTs is >= 10 OR the sum of SALE_AMOUNT over all previous and current DAY_DTs is >= 25.
So the results of the query applied to the above table would be:
ID  APPROVAL_DT  ACTIVATED_DT
1   2010-04-22   2010-04-27
2   2010-03-28   2010-04-04
3   2010-04-25   2010-05-02
4   2010-06-01   NULL
Any thoughts?
I assume you mean that you want to find, within each ID, the first day_dt on which the sum of trans_count up to that day is >= 10 or the sum of sale_amount up to that day is >= 25. You call this day the 'activated_dt'. Your description is quite different from this, because it does not specify that you want only the first such day, and it asks for the sum of all previous days while your example result shows the sum up to and including the day.
I agree with Martin here that a running total would perform best, as it could produce the result in a single scan of the table.
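To make the running-total idea concrete, here is a minimal sketch, not necessarily what Martin had in mind. It assumes your database supports a windowed SUM() OVER (PARTITION BY ... ORDER BY ...); I run it through SQLite (3.25 or later) from Python only so that it can be tried as is. The table name daily_sales is made up; the columns and rows are the sample from the question.

```python
import sqlite3

# In-memory database; "daily_sales" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE daily_sales (
        id INTEGER, approval_dt TEXT, day_dt TEXT,
        trans_count INTEGER, sale_amount INTEGER
    )
""")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2010-04-22", "2010-04-27", 2, 260),
        (1, "2010-04-22", "2010-04-28", 1, 40),
        (2, "2010-03-28", "2010-04-02", 1, 5),
        (2, "2010-03-28", "2010-04-03", 5, 10),
        (2, "2010-03-28", "2010-04-04", 1, 20),
        (3, "2010-04-25", "2010-05-01", 6, 10),
        (3, "2010-04-25", "2010-05-02", 4, 10),
        (4, "2010-06-01", "2010-06-07", 1, 5),
    ],
)

QUERY = """
WITH running AS (
    -- Running totals per id, accumulated in day_dt order
    -- (the current day is included in each total).
    SELECT id, approval_dt, day_dt,
           SUM(trans_count) OVER (PARTITION BY id ORDER BY day_dt) AS run_trans,
           SUM(sale_amount) OVER (PARTITION BY id ORDER BY day_dt) AS run_sales
    FROM daily_sales
)
-- Earliest day on which either threshold is met; NULL if none is.
SELECT id, approval_dt,
       MIN(CASE WHEN run_trans >= 10 OR run_sales >= 25
                THEN day_dt END) AS activated_dt
FROM running
GROUP BY id, approval_dt
ORDER BY id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The MIN(CASE ...) over the grouped rows is what keeps IDs that never reach a threshold in the output with a NULL activated_dt, as ID 4 is in the expected result. The dates here are ISO-formatted text, so MIN compares them correctly; with a real date column the same query shape should apply.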
A solution without running totals would have to compute the total over the previous days for each day_dt and then pick the first qualifying one for each ID: