I got a weird problem of MySQL index. I have a table views_video:
CREATE TABLE `views_video` (
`video_id` smallint(5) unsigned NOT NULL,
`record_date` date NOT NULL,
`region` char(2) NOT NULL DEFAULT '',
`views` mediumint(8) unsigned NOT NULL
PRIMARY KEY (`video_id`,`record_date`,`region`),
KEY `video_id` (`video_id`)
)
The table contains 3.4 million records.
I run the EXPLAIN on this query:
SELECT video_id, views FROM views_video where video_id <= 156
I got:
+----+-------------+-------------+-------+------------------+----------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+------------------+----------+---------+------+--------+-------------+
| 1 | SIMPLE | views_video | range | PRIMARY,video_id | video_id | 2 | NULL | 587984 | Using where |
+----+-------------+-------------+-------+------------------+----------+---------+------+--------+-------------+
But when I run the EXPLAIN on this query:
SELECT video_id, views FROM views_video where video_id <= 157
I got:
+----+-------------+-------------+------+------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | views_video | ALL | PRIMARY,video_id | NULL | NULL | NULL | 3412892 | Using where |
+----+-------------+-------------+------+------------------+------+---------+------+---------+-------------+
video_id is from 1 to 1034. There is nothing special between 156 and 157.
What happens here?
* update *
I have added more data into the database. Now video_id is from 1 to 1064. And the table now has 3.8M records. And the difference become 114 and 115.
I’m guessing that with 3.4 million records, and only 1064 possible entries for your key, your selectivity is very low. (In other words, there are many duplicates, which makes it far less useful as a key.) The optimizer is taking its best guess if it is more efficient to use the key or not. You’ve found a threshold for that decision.