I have a simple table
stock_ledger_id INT(10) (Primary)
piece_to_bin_id INT(10)
quantity INT(11)
create_datetime TIMESTAMP
... and a few VARCHARs
with some simple indexes
Key_name         Cardinality
PRIMARY          1510443
piece_to_bin_id  100696
This rather simple query takes about 8 seconds:
SELECT piece_to_bin_id,
SUM(quantity),
MAX(create_datetime)
FROM stock_ledger
GROUP BY piece_to_bin_id
Here’s the EXPLAIN:
id  select_type  table         type  possible_keys  key   key_len  ref   rows     Extra
1   SIMPLE       stock_ledger  ALL   NULL           NULL  NULL     NULL  1512976  Using temporary; Using filesort
I found that I can bring it down to about 0.5 seconds by forcing an index:
SELECT piece_to_bin_id,
SUM(quantity),
MAX(create_datetime)
FROM stock_ledger
FORCE INDEX (piece_to_bin_id)
GROUP BY piece_to_bin_id
Then the EXPLAIN looks like this:
id  select_type  table         type   possible_keys  key              key_len  ref   rows     Extra
1   SIMPLE       stock_ledger  index  NULL           piece_to_bin_id  4        NULL  1512976
I am using MySQL 5.1.41, the table is MyISAM and I did run ANALYZE TABLE before.
So am I stuck with “MySQL got it wrong again, just force the index” or is there an actual reason why MySQL uses a full table scan? Maybe one I can fix?
The query needs a full table scan anyway, so MySQL may be trying to avoid the additional lookup from each key value to its row. The query might benefit much more from a composite (piece_to_bin_id, create_datetime) index, or even (piece_to_bin_id, create_datetime, quantity). The latter would be a covering index.
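A minimal sketch of the covering-index suggestion, assuming the table and column names from the question. The index name idx_bin_datetime_qty is made up for illustration, and I have not verified how the 5.1.41 optimizer reacts on your data:

```sql
-- Hypothetical index name; the column list is the three-column
-- composite index suggested above.
ALTER TABLE stock_ledger
  ADD INDEX idx_bin_datetime_qty (piece_to_bin_id, create_datetime, quantity);

-- Re-check the plan for the original query, this time without FORCE INDEX.
EXPLAIN
SELECT piece_to_bin_id,
       SUM(quantity),
       MAX(create_datetime)
FROM stock_ledger
GROUP BY piece_to_bin_id;
```

If the optimizer picks the new index, the Extra column should report "Using index", meaning the query is answered from the index alone without touching the data rows. If it still chooses the full scan, FORCE INDEX with the new index name remains an option.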
UPD
It seems the 16x faster result comes from the data distribution in your case (probably many adjacent rows with the same piece_to_bin_id, sorted by create_datetime). MyISAM seems to use indexes only for queries that reduce the number of resulting rows, because using them implies random disk I/O operations. I had never paid attention to this before, but my current tests on a table of 10K rows show that MyISAM does not even use the index for a query that merely sorts by indexed_field, even when indexed_field is the primary key.