I’m looking at a query that outputs something like the following:
Group | Date 1     | Value 1 | Date 2     | Value 2
------+------------+---------+------------+--------
A     | 2011-06-15 | 105     | NULL       | NULL
A     | NULL       | NULL    | 2011-06-16 | 107
B     | 2011-06-18 | 567     | NULL       | NULL
B     | NULL       | NULL    | 2011-06-20 | 525
What I want is to “flatten” these results by group – sort of like a COALESCE by row:
Group | Date 1     | Value 1 | Date 2     | Value 2
------+------------+---------+------------+--------
A     | 2011-06-15 | 105     | 2011-06-16 | 107
B     | 2011-06-18 | 567     | 2011-06-20 | 525
Note that there will only ever be 2 rows per group; this isn’t dealing with an arbitrary number of data points, it’s essentially a “before and after” query.
Is it possible to do this in a single pass? (There is a large amount of data in the results.) That means:
- No temp tables, table variables, or similar intermediate steps (CTEs are fine);
- No joining to itself, i.e. the obvious solution of getting the MIN and MAX in one query and then joining to those dates to get the values. As mentioned above, this is the result set of an expensive query, and I am trying very hard not to double the load.
I know that I could technically do this with a cursor, but I’d really really prefer not to unless somebody can prove to me that it is faster than any set-based option.
P.S. Please note that for group “B”, Value 2 is lower than Value 1. I’ve done this explicitly to demonstrate why a single GROUP BY which takes the MIN and MAX of both dates and values isn’t going to produce the expected results. The values have to be correlated to the dates.
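To make that pitfall concrete, here is a minimal sketch of the naive query. It assumes T-SQL and assumes the underlying rows have the shape (Group, Date, Value), which the question does not actually show; the CTE name Results and its contents are hypothetical test data copied from the example above.

```sql
-- Hypothetical test data: one row per (Group, Date, Value), taken from the example tables.
WITH Results ([Group], [Date], [Value]) AS (
    SELECT 'A', CAST('2011-06-15' AS date), 105 UNION ALL
    SELECT 'A', CAST('2011-06-16' AS date), 107 UNION ALL
    SELECT 'B', CAST('2011-06-18' AS date), 567 UNION ALL
    SELECT 'B', CAST('2011-06-20' AS date), 525
)
-- Naive approach: aggregate dates and values independently of each other.
SELECT [Group],
       MIN([Date])  AS [Date 1],
       MIN([Value]) AS [Value 1],
       MAX([Date])  AS [Date 2],
       MAX([Value]) AS [Value 2]
FROM Results
GROUP BY [Group];
-- Group A happens to look right, but group B comes back as
-- 2011-06-18 | 525 | 2011-06-20 | 567: the values are swapped
-- because MIN([Value]) is not tied to the row that holds MIN([Date]).
```

Group A only looks correct because its value happens to rise over time; the query never correlates a value with its date.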
Assuming that you always get pairs in the Groups, you can do a GROUP BY [Group] and for each of the 4 columns select the MAX(). Aggregate functions ignore NULLs, so within each group this simply picks the one non-null value in each column.
Edit: if you have a dataset with the columns “Group”, “Date” and “Value” and want to merge those so that you have the “Group”, “Date 1”, “Value 1”, “Date 2” and “Value 2” columns, use this (the first CTE represents the test data):