I have two tables that are all the same, except one has a timestamp value column and the other has a datetime value column. Indexes are the same. Values are the same.
But when I run SELECT station, MAX(timestamp) AS max_timestamp FROM stations GROUP BY station; if stations is the one with timestamps, it executes really fast, and if I try it with the datetime one, well I haven’t seen one query executes. In both cases the timestampcolumn is indexed, only the type changes.
Where should I start looking for? Or is datetime just not suitable for search and indexing ?
Here is what EXPLAIN gives :
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
| 1 | SIMPLE | stations | range | NULL | stamp | 33 | NULL | 1511 | Using index for group-by |
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------+
| 1 | SIMPLE |stations2 | index | NULL | station | 2 | NULL | 3025467 | |
+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------+
And the SHOW:
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stations | CREATE TABLE `stations` (
`station` varchar(10) COLLATE utf8_bin DEFAULT NULL,
`available` smallint(6) DEFAULT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
UNIQUE KEY `stamp` (`station`,`timestamp`),
KEY `time` (`timestamp`),
KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stations2 | CREATE TABLE `stations2` (
`station` smallint(5) unsigned NOT NULL,
`available` smallint(5) unsigned DEFAULT NULL,
`timestamp` datetime DEFAULT NULL,
KEY `station` (`station`),
KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
You can see from the EXPLAIN that there is no key being used for selection (NULL for possible_keys). You don’t have a WHERE clause, so this makes sense.
MySQL can utilize an index to determine MAX, and it can utilize an index to optimize GROUP BY. However, to be able to optimize both combined, you would need both the column in your MAX() function and the column in your GROUP BY clause to be in a compound index. In the first table, you have this compound index as a unique key called ‘stamp’. The EXPLAIN result shows that MySQL is using that index.
On the second table, you don’t have this compound index, so MySQL is having to perform a lot more work. It has to manually group the results and keep the MAX value for each station by manually scanning each row. If you add the same compound index on the second table, you will see similar performance between the two.
However, TIMESTAMP will still slightly outperform DATETIME because TIMESTAMP is treated as a single 4 byte integer value, which is processed faster than an 8 byte special DATETIME value. The larger the data set, the larger difference you will see.