I have a table like this (basic example, not the real thing):
FKEY | NAME  | ATTRIBUTE_X
--------------------------
1    | '...' | 42
1    | '...' | 42
1    | '...' | 42
2    | '...' | 7
2    | '...' | 7
5    | '...' | 42
5    | '...' | 42
5    | '...' | 42
5    | '...' | 42
6    | '...' | 300
6    | '...' | 300
....
Normally, all attribute_x values for a given fkey are the same. (In my real data, I calculate attribute_x from some columns in the table, and this property needs to be the same for all rows with the same fkey.)
Now I have some rows where this property does not hold. I want to search the whole table to find all FKEYs with mismatched attribute_x values.
Example:
--------------------------
145678973 | '...' | 23
145678973 | '...' | 22   // Error, should also be 23
145678973 | '...' | 23
My naive approach was:
SELECT distinct(TX1.FKEY)
FROM TABLEX TX1, TABLEX TX2
WHERE TX1.FKEY=TX2.FKEY
AND TX1.ATTRIBUTE_X <> TX2.ATTRIBUTE_X
;
But with my real data this doesn't complete. (I ran out of temp tablespace, and after the DBA increased the temp tablespace to 20 GB, the query ran for a few hours and then bailed out.)
Generally, is there a more efficient query for this?
I have a solution with PL/SQL where I just loop over the table sorted by FKEY, and if I find a different attribute_x vs. the last fetched record where the fkey stayed the same, I have found an erroneous fkey.
But this seems oh so primitive 🙂 Is there an efficient pure SQL solution?
Simplest way:
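One common pure-SQL approach, sketched here against the simplified TABLEX from the question (not necessarily the exact query the answer originally showed), is to group by FKEY and keep only the groups that contain more than one distinct ATTRIBUTE_X. This reads the table once and avoids the self-join, which pairs every row with every other row that has the same FKEY.

```sql
-- Sketch, assuming the simplified TABLEX layout from the question.
-- If ATTRIBUTE_X is really computed from other columns, substitute
-- that expression for ATTRIBUTE_X below.
SELECT FKEY
FROM TABLEX
GROUP BY FKEY
HAVING COUNT(DISTINCT ATTRIBUTE_X) > 1;

-- Equivalent variant that compares the smallest and largest value
-- per group instead of counting distinct values.
SELECT FKEY
FROM TABLEX
GROUP BY FKEY
HAVING MIN(ATTRIBUTE_X) <> MAX(ATTRIBUTE_X);
```

Both variants ignore NULLs in ATTRIBUTE_X, so a group that mixes NULL with a single non-NULL value would not be reported. If that case matters for the real data, it needs an extra check.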