I am retrieving three sets of data, each of which should contain ‘unique’ rows, because I have to perform a different operation on each set. However, I am retrieving more rows in total than there are in the table, so I must be retrieving duplicate rows somewhere. Here are my three queries:
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID = t1.ID AND t2.NAME = t1.NAME AND t2.ADDRESS <> t1.ADDRESS

SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID = t1.ID AND t2.NAME <> t1.NAME AND t2.ADDRESS <> t1.ADDRESS

SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID <> t1.ID AND t2.NAME = t1.NAME AND t2.ADDRESS <> t1.ADDRESS
As you can see, I am selecting (in order of queries, and in every case only where the address differs):
- Set of data where the id AND name match
- Set of data where the id matches but the name does NOT
- Set of data where the id does not match but name DOES
When I add up the number of results returned by all three queries, I get MORE rows than exist in T1. I don’t think that should be logically possible, and if it is, it means some rows appear in more than one set. That prevents me from executing a different command against each set, since a row would have more than one command executed on it.
Can someone find where I’m going wrong here?
Consider if Name is not unique. If you have the following data:
Then Query 1 gives you:
Because rows 1 & 2 in Table 1 match rows 1 & 2, respectively, in Table 2.
Query 2 gives you nothing.
Query 3 gives you:
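Because row 1 in Table 1 matches row 2 in Table 2 and row 2 in Table 1 matches row 1 in Table 2. Thus you get 4 rows out of Table 1 when there are only 2 rows in it. DISTINCT removes duplicates only within a single query's result, so the same Table 1 row can still be returned by both Query 1 and Query 3.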
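To make this concrete, here is a minimal sketch with hypothetical data. The rows, column types, and the name 'Smith' are my own illustration, not the data from the example above: two rows share a name but have different IDs, and every address differs between the two tables. The last query is only one possible way to keep the sets apart, and it assumes that an ID match should take priority over a name match and that ID is unique in table2.

```sql
-- Hypothetical schema and rows (assumed for illustration only).
CREATE TABLE table1 (ID INT, NAME VARCHAR(20), ADDRESS VARCHAR(40));
CREATE TABLE table2 (ID INT, NAME VARCHAR(20), ADDRESS VARCHAR(40));

INSERT INTO table1 VALUES (1, 'Smith', '1 Old Road');
INSERT INTO table1 VALUES (2, 'Smith', '2 Old Road');
INSERT INTO table2 VALUES (1, 'Smith', '1 New Road');
INSERT INTO table2 VALUES (2, 'Smith', '2 New Road');

-- Query 1: same ID, same NAME, different ADDRESS.
-- Both table1 rows qualify (each pairs with the table2 row of the same ID).
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
  ON t2.ID = t1.ID AND t2.NAME = t1.NAME AND t2.ADDRESS <> t1.ADDRESS;

-- Query 3: different ID, same NAME, different ADDRESS.
-- Both table1 rows qualify again (each pairs with the table2 row of the other ID).
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
  ON t2.ID <> t1.ID AND t2.NAME = t1.NAME AND t2.ADDRESS <> t1.ADDRESS;

-- Possible variant of Query 3 (an assumption about the intent, not the
-- original query): skip table1 rows whose ID appears anywhere in table2,
-- so a row already handled by an ID-based query is not picked up again.
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
  ON t2.NAME = t1.NAME AND t2.ADDRESS <> t1.ADDRESS
WHERE NOT EXISTS (SELECT 1 FROM table2 x WHERE x.ID = t1.ID);
```

With these rows, Query 1 and the original Query 3 each return both rows of table1, which is how the combined count exceeds the size of the table. The variant returns no rows here, because both IDs exist in table2.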