I have a simple question regarding the dichotomy of joining two tables vs using 2 separate queries.
I was hoping to find an existing question, but my search didn’t yield much (most questions were for more complex problems).
For example, consider two tables, A and X, with a very simple schema:
Table A
+--------------+---------------+----------+
| Column A (*) | Column X (FK) | Column C |
+--------------+---------------+----------+

Table X
+--------------+----------+----------+
| Column X (*) | Column Y | Column Z |
+--------------+----------+----------+
Where columns A and X are identity columns and primary keys (bigint). There is also an existing foreign key relationship for column X between tables A and X.
My question is: assuming both tables are sufficiently large (say 500K rows), would I get better performance from a single query (see the LINQ to SQL pseudocode below) or from two separate queries?
Option 1:
long aValue = 107;
DataContext dc = new DataContext();
var items = (from a in dc.TableA
             join x in dc.TableX
                 on a.X equals x.X
             where a.A == aValue
             select new { a, x });
Option 2:
- Just assume I write a stored procedure (SP) that runs two separate SELECT statements serially.
To further quantify the problem, you can assume that for every value of A only a few (0-5) rows will be joined from Table X, so the duplication of Table A data returned by the join is not significant.
I’m asking strictly from a DB server impact standpoint. Ignoring any client-side considerations (e.g. round-trip network latency, L2S query building and data marshalling costs, etc.), my questions are:
- Which option will take less time to compute on the DB server?
- Which option will require less memory to evaluate the result?
- Which option is generally preferred, if there is a best practice?
Sorry if this sounds too rudimentary, but any insight will be appreciated.
Thanks,
– K.
Short answer: Trust the optimizer.
A single query (especially one with a simple join) against well-indexed tables will be more efficient than a set of serial SQL statements. I’m not an expert in LINQ, so I’m not sure which columns your pseudocode will return, but if the tables are properly indexed and running on appropriate hardware, you’ll be fine.
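If you would rather check than take this on faith, here is a minimal sketch of the comparison. It is written in Python against an in-memory SQLite database, which is only a stand-in for SQL Server and LINQ to SQL. The table names `table_a` and `table_x`, the index name, the row counts, and both helper functions are assumptions made up for the example. It runs the join and the two serial lookups, confirms they return the same rows, and asks the engine for its plan for the join.

```python
import sqlite3

# Assumption: SQLite stands in for the real server; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_x (x INTEGER PRIMARY KEY, y TEXT, z TEXT);
    CREATE TABLE table_a (
        a INTEGER PRIMARY KEY,
        x INTEGER REFERENCES table_x(x),
        c TEXT
    );
    CREATE INDEX ix_table_a_x ON table_a(x);
""")

# Small synthetic data set (far fewer rows than the 500K in the question).
conn.executemany(
    "INSERT INTO table_x VALUES (?, ?, ?)",
    [(i, f"y{i}", f"z{i}") for i in range(1, 1001)],
)
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?)",
    [(i, (i % 1000) + 1, f"c{i}") for i in range(1, 5001)],
)

JOIN_SQL = (
    "SELECT a.a, a.x, a.c, x.y, x.z "
    "FROM table_a AS a JOIN table_x AS x ON a.x = x.x "
    "WHERE a.a = ?"
)


def option_1_join(conn, a_value):
    """Option 1: a single query; the engine resolves the join."""
    return conn.execute(JOIN_SQL, (a_value,)).fetchall()


def option_2_serial(conn, a_value):
    """Option 2: two serial queries, stitched together by hand."""
    rows = []
    a_rows = conn.execute(
        "SELECT a, x, c FROM table_a WHERE a = ?", (a_value,)
    ).fetchall()
    for a, x, c in a_rows:
        x_rows = conn.execute(
            "SELECT y, z FROM table_x WHERE x = ?", (x,)
        ).fetchall()
        for y, z in x_rows:
            rows.append((a, x, c, y, z))
    return rows


a_value = 107
join_rows = option_1_join(conn, a_value)
serial_rows = option_2_serial(conn, a_value)

# Ask the engine how it intends to execute the join.
plan = conn.execute("EXPLAIN QUERY PLAN " + JOIN_SQL, (a_value,)).fetchall()
for step in plan:
    print(step)
```

Both options return the same data, so what remains is how the engine executes them. On a real SQL Server instance the equivalent step would be to compare the actual execution plans of the two options; whatever SQLite reports here does not carry over to another engine.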