Are there any good ways to objectively measure a query’s performance in Oracle 10g? There’s one particular query that I’ve been tuning for a few days. I’ve gotten a version that seems to be running faster (at least based on my initial tests), but the EXPLAIN cost is roughly the same.
- How likely is it that the EXPLAIN cost is missing something?
- Are there any particular situations where the EXPLAIN cost is disproportionately different from the query’s actual performance?
- I used the first_rows hint on this query. Does this have an impact?
Is the EXPLAIN cost likely to be missing something? Very unlikely. In fact, it would be a level 1 bug 🙂

Actually, if your statistics have changed significantly since the time you ran the EXPLAIN, the actual query plan will differ. But as soon as the query is compiled, the plan will remain the same.

Note that EXPLAIN PLAN may show you things that are likely to happen but may never happen in an actual query. For example, if you run an EXPLAIN PLAN on a hierarchical query with indexes on both id and parent, you will see an extra FULL TABLE SCAN which most probably will not happen in real life.

Use STORED OUTLINEs to store and reuse the plan no matter what.

Are there situations where the EXPLAIN cost is disproportionately different from the query's actual performance? Yes, it happens very often on complicated queries.
The CBO (cost-based optimizer) uses calculated statistics to estimate query time and choose the optimal plan.

If you have lots of JOINs, subqueries and these kinds of things in your query, its algorithm cannot predict exactly which plan will be faster, especially when you hit memory limits.

Here's the particular situation you asked about: a HASH JOIN, for instance, will need several passes over the probe table if the hash table does not fit into the memory allowed by pga_aggregate_target, but as of Oracle 10g, I don't remember this ever being taken into account by the CBO.

That's why I hint every query I expect to run for more than 2 seconds in the worst case.

As for the first_rows hint: it makes the optimizer choose a plan with a lower response time, one that returns the first rows as soon as possible, even if the overall query time ends up larger.

In practice, it almost always means using NESTED LOOPs instead of HASH JOINs. NESTED LOOPs have poorer overall performance on large datasets, but they return the first rows faster (since no hash table needs to be built).

As for the query from your original question, see my answer here.
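To make the first-rows trade-off concrete, here is a toy Python sketch (not Oracle's implementation) that counts how many input rows each join strategy touches before it can return its first result. All names (`ReadCounter`, `nested_loop_join`, `hash_join`, `demo`) and the sample tables are hypothetical, and the nested loop is assumed to have a ready-made index on the inner table, modeled here as a dictionary.

```python
from typing import Dict, Iterator, List, Tuple

Row = Tuple[int, str]  # (join key, payload)


class ReadCounter:
    """Counts how many input rows a join has touched so far."""

    def __init__(self) -> None:
        self.rows_read = 0


def nested_loop_join(
    outer: List[Row], inner_index: Dict[int, List[str]], counter: ReadCounter
) -> Iterator[Tuple[int, str, str]]:
    # For each outer row, look up matches through an existing index.
    # Nothing has to be prepared up front, so the first row comes out early.
    for key, outer_val in outer:
        counter.rows_read += 1
        for inner_val in inner_index.get(key, []):
            counter.rows_read += 1
            yield key, outer_val, inner_val


def hash_join(
    build: List[Row], probe: List[Row], counter: ReadCounter
) -> Iterator[Tuple[int, str, str]]:
    # The whole build side must be hashed before any probe row can match.
    table: Dict[int, List[str]] = {}
    for key, build_val in build:
        counter.rows_read += 1
        table.setdefault(key, []).append(build_val)
    for key, probe_val in probe:
        counter.rows_read += 1
        for build_val in table.get(key, []):
            yield key, probe_val, build_val


def demo() -> Tuple[int, int]:
    # Hypothetical data: 1000 customers, 5000 orders referencing them.
    customers = [(i, f"cust{i}") for i in range(1000)]
    orders = [(i % 1000, f"order{i}") for i in range(5000)]

    # Stand-in for a pre-existing database index on customers.
    customer_index: Dict[int, List[str]] = {}
    for key, name in customers:
        customer_index.setdefault(key, []).append(name)

    nl_counter = ReadCounter()
    next(nested_loop_join(orders, customer_index, nl_counter))

    hj_counter = ReadCounter()
    next(hash_join(customers, orders, hj_counter))

    # Rows touched before the first joined row was produced.
    return nl_counter.rows_read, hj_counter.rows_read


if __name__ == "__main__":
    print(demo())  # (2, 1001)
```

In this toy model the nested loop touches 2 rows before returning its first result, while the hash join touches 1001 because it has to consume the entire build side first. It says nothing about total run time, where the hash join usually wins on large inputs.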