my understanding is that HASH JOIN only makes sense when one of the 2 tables is small enough to fit into memory as a hash table.
but when I gave a query to oracle, with both tables having several hundred million rows, oracle still came up with a hash join explain plan. even when I tricked it with OPT_ESTIMATE(rows = ….) hints, it always decides to use HASH JOIN instead of merge sort join.
so I wonder how is HASH JOIN possible in the case of both tables being very large?
thanks
Yang
Hash joins obviously work best when everything can fit in memory. But that does not mean they are not still the best join method when the table can’t fit in memory. I think the only other realistic join method is a merge sort join.
If the hash table can’t fit in memory, than sorting the table for the merge sort join can’t fit in memory either. And the merge join needs to sort both tables. In my experience, hashing is always faster than sorting, for joining and for grouping.
But there are some exceptions. From the Oracle® Database Performance Tuning Guide, The Query Optimizer:
Test
Instead of creating hundreds of millions of rows, it’s easier to force Oracle to only use a very small amount of memory.
This chart shows that hash joins outperform merge joins, even when the tables are too large to fit in (artificially limited) memory:
Notes
For performance tuning it’s usually better to use bytes than number of rows. But the "real" size of the table is a difficult thing to measure, which is why the chart displays rows. The sizes go approximately from 0.375 MB up to 14 MB. To double-check that these queries are really writing to disk you can run them with /*+ gather_plan_statistics */ and then query v$sql_plan_statistics_all.
I only tested hash joins vs merge sort joins. I didn’t fully test nested loops because that join method is always incredibly slow with large amounts of data. As a sanity check, I did compare it once with the last data size, and it took at least several minutes before I killed it.
I also tested with different _area_sizes, ordered and unordered data, and different distinctness of the join column (more matches is more CPU-bound, less matches is more IO bound), and got relatively similar results.
However, the results were different when the amount of memory was ridiculously small. With only 32K sort|hash_area_size, merge sort join was significantly faster. But if you have so little memory you probably have more significant problems to worry about.
There are still many other variables to consider, such as parallelism, hardware, bloom filters, etc. People have probably written books on this subject, I haven’t tested even a small fraction of the possibilities. But hopefully this is enough to confirm the general consensus that hash joins are best for large data.
Code
Below are the scripts I used: