In general, what makes an SQL query optimiser decide between a nested loop and a hash join?
NESTED LOOPS are good if the condition inside the loop is sargable, that is, if an index can be used to limit the number of records.

For a query that joins a to b on b.b1, with a leading, each record from a is taken and all corresponding records in b must be found.

If b.b1 is indexed and has high cardinality, then NESTED LOOPS will be the preferred method.

In SQL Server, it is also the only way to execute non-equijoins (something other than an = condition in the ON clause).

HASH JOIN is the fastest method if all (or almost all) records must be processed.

It takes all records from b, builds a hash table over them, then takes all records from a and uses the value of the join column as a key to look up the hash table.

NESTED LOOPS takes this time: Na * (Nb / C) * R, where Na and Nb are the numbers of records in a and b, C is the index cardinality, and R is the constant time required for the row lookup (1 if all fields in the SELECT, WHERE and ORDER BY clauses are covered by the index, about 10 if they are not).

HASH JOIN takes this time: Na + (Nb * H), where H is the sum of the constants required to build and look up the hash table (per record). They are programmed into the engine.

SQL Server computes the cardinality using the table statistics, computes and compares the two values, and chooses the best plan.
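
To see how the two estimates trade off, here is a rough Python sketch that plugs numbers into the formulas above. The function names and the sample values (in particular H = 2 and R = 10) are assumptions for illustration only. They are not SQL Server's real constants, and the real optimiser's cost model is considerably more detailed.

```python
def nested_loops_cost(na, nb, c, r):
    """Estimate from the answer: Na * (Nb / C) * R."""
    return na * (nb / c) * r


def hash_join_cost(na, nb, h):
    """Estimate from the answer: Na + (Nb * H)."""
    return na + nb * h


def choose_join(na, nb, c, r, h):
    """Return the name of the cheaper method under the two estimates."""
    nl = nested_loops_cost(na, nb, c, r)
    hj = hash_join_cost(na, nb, h)
    return "NESTED LOOPS" if nl < hj else "HASH JOIN"


if __name__ == "__main__":
    # Assumed scenario 1: a filter leaves only 10 rows from a,
    # and b.b1 is unique over 1,000,000 rows (C equals Nb).
    print(choose_join(na=10, nb=1_000_000, c=1_000_000, r=10, h=2))

    # Assumed scenario 2: all 1,000,000 rows of a take part in the join.
    print(choose_join(na=1_000_000, nb=1_000_000, c=1_000_000, r=10, h=2))
```

With few rows coming from a and a selective index on b.b1, the nested loops estimate stays small, while the hash join still pays for hashing all of b. When almost every row takes part, the per-row lookup cost R dominates and the hash join estimate wins.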