Very simplified, a table with some sample data:
action_date account_id
1/1/2010 123
1/1/2010 123
1/1/2010 456
1/2/2010 123
1/3/2010 789
For the data above, I need a query that will give the following:
action_date num_events num_unique_accounts num_unique_accounts_wtd
1/1/2010 3 2 2
1/2/2010 1 1 2
1/3/2010 1 1 3
As you can see here, num_unique_accounts_wtd gives a kind of rolling end date for the unique period…
At first, one would think a query of the form
WITH
events AS
(
SELECT
action_date
, COUNT(account_id) num_events
, COUNT(DISTINCT account_id) num_unique_accounts
FROM actions
GROUP BY action_date
)
SELECT
action_date
, num_events
, num_unique_accounts
, SUM(num_unique_accounts) OVER (PARTITION BY NEXT_DAY(action_date, 'Monday') - 7 ORDER BY action_date ASC) num_unique_accounts_wtd
FROM events
would work but if you look closely it just adds the num_unique_accounts each day.. if the query were to run, for clarity, for 1/2/2010 it would give num_unique_accounts_wtd = 3 because of the 2 + 1.
Any ideas?
EDIT: Added one more row of data and output for clarity
It seemed the answer may have been to be able to modify the analytical function to include something of the form
because RANGE BETWEEN allows expressions so the PARTITION BY window can be further subsetted in order to get what we’re looking for — unfortunately Oracle gives a
error so we can’t use this.
Subsequent to googling the error I found others attempting the same thing (here and here) and within the links two answers were found — one of which I used for my real-world data.
For reference, the answer for this question with the model in the original post would be something of the form:
So the data is