Table foobar has the following structure and data:
id, action_dt, status_id
1, '02-JUL-10', 'x'
1, '02-JUL-10', '2'
1, '02-JUL-10', NULL
2, '02-JUL-10', 'a'
2, '02-JUL-10', 'b'
3, '02-JUL-10', 'k'
3, '02-JUL-10', NULL
3, '03-JUL-10', 'k'
3, '03-JUL-10', NULL
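For anyone who wants to reproduce this, a setup script along these lines should recreate the sample data. The column types are assumptions (the real table may differ), and '02-JUL-10' is assumed to mean 2 July 2010:

```sql
-- Assumed column types; adjust to match the real table.
CREATE TABLE foobar (
    id        INTEGER,
    action_dt DATE,
    status_id VARCHAR(1)
);

-- One INSERT per row, using ANSI date literals.
INSERT INTO foobar (id, action_dt, status_id) VALUES (1, DATE '2010-07-02', 'x');
INSERT INTO foobar (id, action_dt, status_id) VALUES (1, DATE '2010-07-02', '2');
INSERT INTO foobar (id, action_dt, status_id) VALUES (1, DATE '2010-07-02', NULL);
INSERT INTO foobar (id, action_dt, status_id) VALUES (2, DATE '2010-07-02', 'a');
INSERT INTO foobar (id, action_dt, status_id) VALUES (2, DATE '2010-07-02', 'b');
INSERT INTO foobar (id, action_dt, status_id) VALUES (3, DATE '2010-07-02', 'k');
INSERT INTO foobar (id, action_dt, status_id) VALUES (3, DATE '2010-07-02', NULL);
INSERT INTO foobar (id, action_dt, status_id) VALUES (3, DATE '2010-07-03', 'k');
INSERT INTO foobar (id, action_dt, status_id) VALUES (3, DATE '2010-07-03', NULL);
```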
I need a query that returns every (day, ID) pair for which both a NULL and a non-NULL status_id exist on that day. So, in the example dataset above, the query needs to return:
'02-JUL-10', 1
'02-JUL-10', 3
'03-JUL-10', 3
Yes, it can be done using something like:
SELECT
nulls.action_dt
, nulls.id
FROM (SELECT
action_dt
, id
FROM foobar
WHERE status_id IS NULL
GROUP BY action_dt, id) nulls
INNER JOIN (SELECT
action_dt
, id
FROM foobar
WHERE status_id IS NOT NULL
GROUP BY action_dt, id) non_nulls ON nulls.action_dt = non_nulls.action_dt
AND nulls.id = non_nulls.id
but, as you can see, it needs (among other things) two subqueries plus another pass for the join…
The query I’ve been working on and have hopes for is of the form:
SELECT
action_dt
, id
FROM
foobar
GROUP BY
action_dt
, id
, CASE WHEN status_id IS NOT NULL THEN 1 ELSE 0 END
HAVING
COUNT(id) > 1
but it doesn’t quite return what I need: because the CASE expression is part of the GROUP BY, NULL and non-NULL rows land in separate groups, and the HAVING clause only ever sees one group at a time. Any ideas?
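To see why, it can help to list the groups that query builds before any HAVING filter is applied. This is only a diagnostic sketch against the sample data above; `has_status` and `row_count` are illustrative aliases, not columns of foobar:

```sql
-- Show each group the three-column GROUP BY produces, with its row count.
SELECT
    action_dt
  , id
  , CASE WHEN status_id IS NOT NULL THEN 1 ELSE 0 END AS has_status
  , COUNT(*) AS row_count
FROM
    foobar
GROUP BY
    action_dt
  , id
  , CASE WHEN status_id IS NOT NULL THEN 1 ELSE 0 END
ORDER BY
    action_dt
  , id
  , has_status
```

A (day, ID) pair that has both kinds of rows shows up as two separate groups, so a count taken inside one group cannot tell whether the other group exists.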
After all this, it seems a solution would be to have the above query in a subquery and filter it down that way, such as:
SELECT
action_dt
, id
FROM (SELECT
action_dt
, id
FROM
foobar
GROUP BY
action_dt
, id
, CASE WHEN status_id IS NOT NULL THEN 1 ELSE 0 END
) repeat_ids_per_day
GROUP BY
action_dt
, id
HAVING
COUNT(id) > 1
but I feel it can be better…
Your idea is sound: in a case like this you don’t need a subquery; a single aggregate is sufficient and should be more efficient. This should work:
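A minimal sketch of an aggregate-only query, assuming the foobar columns shown in the question. It relies on the standard behaviour that `COUNT(status_id)` skips NULLs while `COUNT(*)` counts every row, so comparing the two reveals whether a group contains NULLs:

```sql
SELECT
    action_dt
  , id
FROM
    foobar
GROUP BY
    action_dt
  , id
HAVING
    COUNT(status_id) > 0          -- at least one non-NULL status_id in the group
AND COUNT(*) > COUNT(status_id)   -- at least one NULL status_id in the group
```

Grouping only by action_dt and id keeps the NULL and non-NULL rows of a given day and ID in the same group, which is what lets a single HAVING clause test for both.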