I have a table with the following structure: ID, Month, Year, Value, with one entry per ID per month. Most consecutive months have the same value.
I would like to create a view for that table that collapses runs of the same value like this: ID, Start Month, End Month, Start Year, End Year, Value, with one row per ID per run of consecutive months that share a value.
The catch is that if a value changes and then goes back to the original, that value should appear in two separate rows of the view.
So:
- 100 1 2008 80
- 100 2 2008 80
- 100 3 2008 90
- 100 4 2008 80
should produce
- 100 1 2008 2 2008 80
- 100 3 2008 3 2008 90
- 100 4 2008 4 2008 80
The following query works for everything except this special case, where the value returns to the original.
select distinct id,
       min(month) keep (dense_rank first order by month)
           over (partition by id, value) startMonth,
       max(month) keep (dense_rank first order by month desc)
           over (partition by id, value) endMonth,
       value
The database is Oracle.
I got it to work as follows. It is heavy on analytic functions and is Oracle-specific.
It might be possible to simplify some of the inner queries somewhat.
The inner query checks whether each month is the first or last month of an interval. If month + 1 equals the next month (lead) within the same id, value grouping, a following month exists, so this month is not the end month; otherwise it is the last month of the interval. The same idea, using the previous month (lag) and month - 1, identifies the first month.
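The flagging step described above could be sketched roughly as follows. This is only an illustration, not the original query: the table name monthly_values is hypothetical, the sample rows are the ones from the question, and case expressions are used where other forms would work equally well. Like the description, it compares month numbers only, so it assumes the data stays within a single year.

```sql
-- Sample rows from the question; monthly_values is a hypothetical table name.
with monthly_values as (
  select 100 id, 1 month, 2008 year, 80 value from dual union all
  select 100, 2, 2008, 80 from dual union all
  select 100, 3, 2008, 90 from dual union all
  select 100, 4, 2008, 80 from dual
)
select id, month, year, value,
       -- Start of an interval unless the previous row in the group is month - 1.
       case when lag(month) over (partition by id, value order by month) = month - 1
            then null else month end as startMonth,
       -- End of an interval unless the next row in the group is month + 1.
       case when lead(month) over (partition by id, value order by month) = month + 1
            then null else month end as endMonth
from monthly_values
order by id, month;
```

For the sample data, month 1 is flagged as a start only, month 2 as an end only, and months 3 and 4 as both. To span several years, one option would be to order and compare on a combined key such as year * 12 + month instead of month.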
The outer query first filters out all rows that are neither start nor end months (where startMonth is not null or endMonth is not null). Each remaining row is then a start month, an end month, or both, depending on which of startMonth and endMonth is not null. If the row is a start month, its end month is the next (lead) endMonth for that id, value, ordered by endMonth; if it is an end month, its start month is the previous (lag) startMonth.
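Putting both steps together, a minimal sketch might look like the query below. It is an assumption-laden illustration rather than the exact query: monthly_values, flagged, boundaries and collapsed are hypothetical names, the rows are ordered by month rather than by endMonth, coalesce stands in for whatever null handling was used, and the year columns are left out because the description works on month numbers only.

```sql
-- Sample rows from the question; all CTE names here are hypothetical.
with monthly_values as (
  select 100 id, 1 month, 2008 year, 80 value from dual union all
  select 100, 2, 2008, 80 from dual union all
  select 100, 3, 2008, 90 from dual union all
  select 100, 4, 2008, 80 from dual
),
flagged as (
  -- Inner step: mark each month as an interval start and/or end.
  select id, month, value,
         case when lag(month) over (partition by id, value order by month) = month - 1
              then null else month end as startMonth,
         case when lead(month) over (partition by id, value order by month) = month + 1
              then null else month end as endMonth
  from monthly_values
),
boundaries as (
  -- Keep only rows that begin or finish an interval.
  select id, month, value, startMonth, endMonth
  from flagged
  where startMonth is not null or endMonth is not null
),
collapsed as (
  -- A start row borrows the next endMonth, an end row the previous startMonth;
  -- distinct then merges the two identical rows of each interval.
  select distinct id, value,
         coalesce(startMonth,
                  lag(startMonth) over (partition by id, value order by month)) as startMonth,
         coalesce(endMonth,
                  lead(endMonth) over (partition by id, value order by month)) as endMonth
  from boundaries
)
select id, startMonth, endMonth, value
from collapsed
order by id, startMonth;
```

With the four sample rows this should give one row for months 1 to 2 with value 80, one for month 3 with value 90, and one for month 4 with value 80, which matches the expected output in the question apart from the omitted year columns.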