I have a SELECT statement that I would like to optimize. The MySQL manual page on ORDER BY optimization says that in some cases an index cannot be used to optimize the ORDER BY. Specifically, the point:
You use ORDER BY on nonconsecutive parts of a key
SELECT * FROM t1 WHERE key2=constant ORDER BY key_part2;
makes me think that this could be the case here. I'm using the following indexes:
UNIQUE KEY `met_value_index1` (`RTU_NB`,`DATETIME`,`MP_NB`),
KEY `met_value_index` (`DATETIME`,`RTU_NB`)
With the following SQL statement:
SELECT * FROM met_value
WHERE rtu_nb=constant
AND mp_nb=constant
AND datetime BETWEEN constant AND constant
ORDER BY mp_nb, datetime
- Would it be enough to delete the index met_value_index1 and recreate it with the new ordering RTU_NB, MP_NB, DATETIME?
- Do I have to include RTU_NB in the ORDER BY clause?
Outcome: I tried what @meriton suggested and added the index met_value_index2. The SELECT now completes in 1.2 seconds; previously it took 5.06 seconds. As a side note that does not belong to the question: after some further experiments I switched the engine from MyISAM to InnoDB, with rtu_nb, mp_nb, datetime as the primary key, and the statement completed in 0.13 seconds!
I don't get your query. If a row must match mp_nb = constant to be returned, all rows returned will have the same mp_nb, so including mp_nb in the ORDER BY clause has no effect. I recommend you use the semantically equivalent statement that orders by datetime alone, to avoid needlessly confusing the query optimizer.
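A sketch of that equivalent statement follows. The literal values are placeholders I made up, and I am assuming integer rtu_nb and mp_nb columns and a DATETIME-typed datetime column; substitute your real constants.

```sql
-- Same filter as in the question; only the ORDER BY changes.
-- mp_nb is fixed by the WHERE clause, so sorting by it adds nothing.
-- All literal values below are placeholders, not taken from the question.
SELECT *
FROM met_value
WHERE rtu_nb = 1
  AND mp_nb = 2
  AND datetime BETWEEN '2011-01-01 00:00:00' AND '2011-01-31 23:59:59'
ORDER BY datetime;
```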
Now, to your question: A database can implement an order by clause without sorting if it knows that the underlying access will return the rows in proper order. In the case of indexes, that means that an index can assist with sorting if the rows matched by the where clause appear in the index in the order requested by the order by clause.
That is the case here, so the database could actually do an index range scan over met_value_index1 for the rows where rtu_nb=constant AND datetime BETWEEN constant AND constant, and then check whether mp_nb=constant for each of these rows. But that would amount to checking far more rows than necessary if mp_nb=constant has high selectivity. Put differently, an index is most useful if the matching rows are contiguous in the index, because that means the index range scan will only touch rows that actually need to be returned.
An index that puts the two equality columns first and DATETIME last, such as (RTU_NB, MP_NB, DATETIME), will therefore be more helpful for this query, as all matching rows will be right next to each other in the index and the rows appear in the index in the order the order by clause requests. I cannot say whether the query optimizer is smart enough to get that, so you should check the execution plan.
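As a rough sketch, creating such an index and checking the plan could look like this. The index name met_value_index2 is the one mentioned in the question's outcome note; the column order follows the ordering proposed in the question, and the literal values are placeholders of mine.

```sql
-- Equality columns first, range/sort column last, so the matching rows
-- are contiguous in the index and already ordered by datetime.
CREATE INDEX met_value_index2
    ON met_value (RTU_NB, MP_NB, DATETIME);

-- Ask MySQL which index it picks and whether it still sorts.
-- Placeholder literals; replace them with your real constants.
EXPLAIN
SELECT *
FROM met_value
WHERE rtu_nb = 1
  AND mp_nb = 2
  AND datetime BETWEEN '2011-01-01 00:00:00' AND '2011-01-31 23:59:59'
ORDER BY datetime;
```

In the EXPLAIN output, the key column shows the index the optimizer chose, and "Using filesort" in the Extra column means the rows are still being sorted in a separate step rather than read in index order.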