Imagine you have these 3 tables:

And imagine there is a massive amount of data stored in this schema.
When I run a query like this:
SELECT DISTINCT tPerson.Name, tPerson.Town
FROM tPerson
JOIN tPersonTypeCodeMap ON tPersonTypeCodeMap.PersonId = tPerson.Id
JOIN tPersonHobbyCodeMap ON tPersonHobbyCodeMap.PersonId = tPerson.Id
WHERE tPersonTypeCodeMap.TypeCode IN ('C', 'S', 'P')
It works quite fast!
But when I add the second condition (NOT IN) the query takes ages:
SELECT DISTINCT tPerson.Name, tPerson.Town
FROM tPerson
JOIN tPersonTypeCodeMap ON tPersonTypeCodeMap.PersonId = tPerson.Id
JOIN tPersonHobbyCodeMap ON tPersonHobbyCodeMap.PersonId = tPerson.Id
WHERE tPersonTypeCodeMap.TypeCode IN ('C', 'S', 'P')
OR tPersonHobbyCodeMap.HobbyCode NOT IN ('SKATE', 'CLIMBING')
Can you tell me what slows down the query and how I can make it run faster?
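In the first query, most of the filtering can be done by looking only at a single table (tPersonTypeCodeMap). In the second query, the condition refers to columns from two tables, so both map tables have to be JOINed before any row can be filtered out. Also, once you introduce “OR” across columns of different tables, the optimizer usually can no longer use an index on either column to narrow down the rows.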
Is it really true that you want “OR” operating on those two filters, and not “AND”? Also, is it true that you want multiple records per person returned, depending on how many TypeCodes they match and how many HobbyCodes they fail to match?
If the condition “OR” is, in fact, what you want, you can use:
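Something along these lines should do it. This is only a sketch built from the table and column names in the question, not tested against your data. Both joins are kept in each branch on the assumption that you still want only people who have at least one row in each map table, as in your original query:

```sql
-- Branch 1: people with one of the wanted type codes.
-- The hobby join is kept only to preserve the inner-join behaviour
-- of the original query (person must have at least one hobby row).
SELECT tPerson.Name, tPerson.Town
FROM tPerson
JOIN tPersonTypeCodeMap ON tPersonTypeCodeMap.PersonId = tPerson.Id
JOIN tPersonHobbyCodeMap ON tPersonHobbyCodeMap.PersonId = tPerson.Id
WHERE tPersonTypeCodeMap.TypeCode IN ('C', 'S', 'P')

UNION

-- Branch 2: people with at least one hobby outside the excluded list.
-- The type join is kept for the same reason as above.
SELECT tPerson.Name, tPerson.Town
FROM tPerson
JOIN tPersonTypeCodeMap ON tPersonTypeCodeMap.PersonId = tPerson.Id
JOIN tPersonHobbyCodeMap ON tPersonHobbyCodeMap.PersonId = tPerson.Id
WHERE tPersonHobbyCodeMap.HobbyCode NOT IN ('SKATE', 'CLIMBING');
```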
This will obtain the two sets of records independently and then UNION them together. Because it uses UNION instead of UNION ALL, a DISTINCT operation is performed on the combined result, reducing it to unique rows.