I have a query that looks like this:
SELECT *
FROM A
INNER JOIN B ON A.AId = B.AId
WHERE A.ADate BETWEEN @Start AND @End
   OR B.BDate BETWEEN @Start AND @End
Both tables A and B are about the same size and have a lot of rows. The execution plan shows an index seek, but it looks like it is scanning the entire index.
If I change the OR to AND, the query is very fast. I think this is because the result of the OR cannot be known without scanning both tables, whereas the AND is easily split into two operations.
I have read some people stating that it is possible to use UNION in place of or, but this would potentially introduce duplicate rows in the case that both conditions in the OR are true.
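For reference, the rewrite I have seen suggested looks roughly like the sketch below. The DECLARE line and its date values are placeholders I made up so the snippet stands alone; the real column types may differ. As far as I understand, plain UNION removes rows that appear in both branches (at the cost of a distinct step over the combined result), whereas UNION ALL would keep them.

```sql
-- Hypothetical parameter values, only here to make the sketch self-contained.
DECLARE @Start date = '2020-01-01', @End date = '2020-12-31';

-- Branch 1: rows whose ADate falls in the range.
SELECT *
FROM A
INNER JOIN B ON A.AId = B.AId
WHERE A.ADate BETWEEN @Start AND @End

UNION  -- removes rows returned by both branches; UNION ALL would not

-- Branch 2: rows whose BDate falls in the range.
SELECT *
FROM A
INNER JOIN B ON A.AId = B.AId
WHERE B.BDate BETWEEN @Start AND @End;
```

One caveat with this form: UNION also collapses joined rows that are genuinely identical in every column, which the original OR query would return separately.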
What solution is there so that I can reduce the size of the join and prevent a full join of both tables? I am open to restructuring the query however possible to make this work, but the logic of the query (give me items where either the date in A matches the range or the date in B matches the range) needs to remain the same.
Thanks for the answers. In the end I opted for UNION ALL: I crafted a query based on the union of two selects that are mutually exclusive, so no duplicates are introduced by the UNION ALL.
First, get all the rows where ADate is in the range, and exclude rows where BDate is in the range. Then get all the rows where BDate is in the range. The union of these two sets logically produces the set of rows that covers ADate or BDate, without double counting the middle (so a UNION ALL will not produce duplicates). Let me know if you see a flaw in this logic; I found it helpful to think of a Venn diagram.
This made the query perform the best of the options presented (in my case) and wasn't overly complicated, so I went with it.
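The two mutually exclusive selects described above can be sketched roughly as follows. This is a minimal sketch, not the exact query I ran: the DECLARE line and its values are placeholders, and the IS NULL check is an assumption that only matters if BDate is nullable. Without it, a row with an in-range ADate and a NULL BDate would be dropped from the first branch, even though the original OR query returns it.

```sql
-- Hypothetical parameter values, only here to make the sketch self-contained.
DECLARE @Start date = '2020-01-01', @End date = '2020-12-31';

-- Branch 1: ADate in range, excluding rows whose BDate is also in range.
SELECT *
FROM A
INNER JOIN B ON A.AId = B.AId
WHERE A.ADate BETWEEN @Start AND @End
  AND (B.BDate NOT BETWEEN @Start AND @End
       OR B.BDate IS NULL)  -- assumption: BDate may be NULL

UNION ALL  -- branches cannot overlap, so no duplicate removal is needed

-- Branch 2: every row whose BDate is in range (covers the overlap once).
SELECT *
FROM A
INNER JOIN B ON A.AId = B.AId
WHERE B.BDate BETWEEN @Start AND @End;
```

Because each branch filters on a single table's date column, the optimizer has a chance to seek on an index for ADate in the first branch and for BDate in the second, assuming such indexes exist.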
Perhaps this could be a query optimization for the OR operator in some scenarios, especially when querying separate, large tables. It works with date ranges, but I imagine it could work with any other predicates.