Guys, I’m new to SQL and can’t figure out the “right way” to do the last part of a query. I have a table that contains a list of items and their equivalents. Every pair is stored twice, once in each direction, so there are twice as many rows as needed. I’m trying to find a way in SQL to select half of the entries so there are no duplicates.
Starting Table with duplicates:
Item  Name    EquivItem
----  ------  ---------
100   bubba   106
103   gump    109
106   shrimp  100
109   grits   103
And the resulting table would be:
Item  Name    EquivItem
----  ------  ---------
100   bubba   106
103   gump    109
I was using a couple of nested loops in sequential code to filter out the duplicates, and I finally wrote a query that works but feels like a hack.
I’m arbitrarily using a WHERE (Item < EquivItem) to select only one of the rows. The actual tables are a bit more complex and I’m afraid there may be a case where this doesn’t work.
SELECT *
FROM T
WHERE Item < EquivItem
I’m trying to take some time to figure out the right way to do things before I develop too many bad habits. Any suggestions? Thanks.
Is it possible for more than two items to be equivalent, such as 100 = 103 = 106?
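As long as the equivalents can’t be chained together, and always have a 1-to-1 relationship, your solution looks perfectly fine to me. Each pair is stored once in each direction, and exactly one of those two rows has Item < EquivItem, so the filter keeps one row per pair.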
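If you want to check that assumption against the real data, here is a rough sketch. It rebuilds the sample table from the question (the column types are my guess, and the multi-row INSERT may need splitting on older databases), then lists any row that would break the 1-to-1 assumption.

```sql
-- Sample data from the question. Column types are assumed.
CREATE TABLE T (
    Item      INTEGER NOT NULL,
    Name      VARCHAR(20),
    EquivItem INTEGER NOT NULL
);

INSERT INTO T (Item, Name, EquivItem) VALUES
    (100, 'bubba',  106),
    (103, 'gump',   109),
    (106, 'shrimp', 100),
    (109, 'grits',  103);

-- Check 1: rows with no matching reverse row, or rows that point at themselves.
SELECT a.Item, a.Name, a.EquivItem
FROM T AS a
LEFT JOIN T AS b
       ON b.Item = a.EquivItem
      AND b.EquivItem = a.Item
WHERE b.Item IS NULL
   OR a.Item = a.EquivItem;

-- Check 2: items listed more than once (one item with several equivalents).
SELECT Item, COUNT(*) AS RowsForItem
FROM T
GROUP BY Item
HAVING COUNT(*) > 1;
```

Both checks return no rows on the sample data. If they also come back empty on the real table, every item has exactly one partner that points back at it, and WHERE Item < EquivItem keeps exactly one row per pair.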
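If more than two items can be equivalent, I would first scrub the data so that every item other than the lowest in its chain has an EquivItem that refers to the lowest Item ID in that chain. After that, only the lowest item’s row can satisfy Item < EquivItem, so your original query would still do the job.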
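One way that scrub might look is sketched below. This is only an illustration under several assumptions: the sample rows are made up to form a chain (100 = 103 = 106) plus one ordinary pair, the Lowest and reach names are hypothetical, and the syntax is the WITH RECURSIVE ... UNION form that PostgreSQL and SQLite accept. SQL Server only allows UNION ALL in a recursive CTE, so it would need a different guard against cycles.

```sql
-- Made-up sample: 100, 103 and 106 form a chain; 109 and 112 are a normal pair.
CREATE TABLE T (
    Item      INTEGER NOT NULL,
    Name      VARCHAR(20),
    EquivItem INTEGER NOT NULL
);

INSERT INTO T (Item, Name, EquivItem) VALUES
    (100, 'bubba',  103),
    (103, 'gump',   106),
    (106, 'shrimp', 100),
    (109, 'grits',  112),
    (112, 'dan',    109);

-- For every item, collect all items reachable through equivalence links in
-- either direction, then keep the lowest ID found. UNION (not UNION ALL)
-- drops repeated rows, so the recursion stops even when the links form a cycle.
CREATE TEMPORARY TABLE Lowest AS
WITH RECURSIVE reach (Item, Member) AS (
    SELECT Item, Item FROM T
    UNION
    SELECT r.Item,
           CASE WHEN t.Item = r.Member THEN t.EquivItem ELSE t.Item END
    FROM reach AS r
    JOIN T AS t
      ON t.Item = r.Member OR t.EquivItem = r.Member
)
SELECT Item, MIN(Member) AS LowestItem
FROM reach
GROUP BY Item;

-- Point every non-lowest item at the lowest item in its chain.
-- The lowest item's own row is left alone, so it still points at a higher ID.
UPDATE T
SET EquivItem = (SELECT l.LowestItem FROM Lowest AS l WHERE l.Item = T.Item)
WHERE Item > (SELECT l.LowestItem FROM Lowest AS l WHERE l.Item = T.Item);

-- The original filter now keeps one row per chain.
SELECT *
FROM T
WHERE Item < EquivItem;
```

On this made-up data the final query keeps the rows for items 100 and 109. The lowest IDs are written to a separate table first so that the UPDATE never reads links it has already rewritten.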