I realize this example is entirely contrived, but I’m looking for a general rule of thumb here; is there any difference from a performance standpoint between these two queries?
Example 1: Reference outer table:
select o.id, o.name,
(select count(*) from inner_table i1 where i1.outerid = o.id),
(select sum(i2.amount) from other_inner_table i2 where i2.outerid = o.id)
from outer_table o
where o.id = @outerid
Example 2: Compare directly against parameter:
select o.id, o.name,
(select count(*) from inner_table i1 where i1.outerid = @outerid),
(select sum(i2.amount) from other_inner_table i2 where i2.outerid = @outerid)
from outer_table o
where o.id = @outerid
I primarily use SQL Server 2008 R2, but I’d be interested in answers specific to any RDBMS.
Update:
I realize this is highly contextual; I guess I was just curious whether this was only a stylistic choice or whether there were circumstances where it would actually make a difference. I realize I could just “test it and see” for specific cases, but that’s not really the answer I’m after.
I created some completely empty tables, with only the required columns to allow your queries to run:
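A minimal sketch of what such tables might look like. Only the table and column names come from the queries in the question; the column types are assumptions, and no keys or indexes are declared, which is consistent with every table being scanned.

```sql
-- Hypothetical minimal schema: names taken from the queries, types assumed.
CREATE TABLE outer_table (
    id   int         NOT NULL,
    name varchar(50) NULL
);

CREATE TABLE inner_table (
    outerid int NOT NULL
);

CREATE TABLE other_inner_table (
    outerid int            NOT NULL,
    amount  decimal(10, 2) NULL
);
```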
I then turned on execution plans and executed both queries. In both cases, the plan showed that all 3 tables were being scanned (as expected, with no/low rows). In particular, in both plans the scans against inner_table and other_inner_table carry a predicate that compares outerid against @outerid rather than against o.id.
That is, in the first example, the optimizer has identified that the outer WHERE clause's condition, where o.id = @outerid, implies that within the subqueries o.id is always equal to @outerid, and has performed that substitution.
In general, unless there is a performance issue, you shouldn’t try to “help” SQL by transforming the queries by hand. There are something like 300 different optimizations that the optimizer has available to it, and you might not pick the best one(s).
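One way to repeat the plan comparison described above is to ask SQL Server for the estimated plan as text instead of using the graphical plan. This is a sketch, not necessarily how the plans above were captured: it assumes the tables already exist, that the script is run from SSMS or sqlcmd (GO is a batch separator for those tools, not a T-SQL statement), and it uses a hypothetical value for @outerid.

```sql
-- While SHOWPLAN_TEXT is ON, statements are compiled but not executed;
-- the estimated plan is returned instead of the query results.
SET SHOWPLAN_TEXT ON;
GO

DECLARE @outerid int = 1;  -- hypothetical value, for illustration only

-- Example 1: subqueries reference the outer table
select o.id, o.name,
    (select count(*) from inner_table i1 where i1.outerid = o.id),
    (select sum(i2.amount) from other_inner_table i2 where i2.outerid = o.id)
from outer_table o
where o.id = @outerid;

-- Example 2: subqueries compare directly against the parameter
select o.id, o.name,
    (select count(*) from inner_table i1 where i1.outerid = @outerid),
    (select sum(i2.amount) from other_inner_table i2 where i2.outerid = @outerid)
from outer_table o
where o.id = @outerid;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Comparing the predicates listed on the scans of inner_table and other_inner_table in the two plans shows whether the optimizer treated the two forms identically for a given schema and data distribution.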