We are running a MySQL/MyISAM database with the following table:
create table measurements (
`tm_stamp` int(11) NOT NULL DEFAULT '0',
`fk_channel` int(11) NOT NULL DEFAULT '0',
`value` int(11) DEFAULT NULL,
PRIMARY KEY (`tm_stamp`,`fk_channel`)
);
The combination of tm_stamp and fk_channel must be unique, hence the compound primary key. Now, for reasons not relevant here, the database will be migrated to the InnoDB engine. After some reading, I found out that in InnoDB the primary key dictates the physical ordering of the data on disk. 90% of the queries currently look like this:
SELECT value FROM measurements
WHERE fk_channel=A AND tm_stamp>=B and tm_stamp<=C
ORDER BY tm_stamp ASC
Inserts arrive in tm_stamp order 99% of the time; the table stores readings from a network of data loggers. It has a few million rows and is growing steadily. The questions are:
- Should changing the storage engine alone result in any significant performance change, for better or worse?
- Does the order of columns in the index matter with regard to the most common SELECT? This blog suggests something along those lines.
- Thanks to the nature of clustered index, may we perhaps leave out the ORDER BY clause and gain some performance?
Edit 1:
It appears that changing the primary key from (tm_stamp, fk_channel) to (fk_channel, tm_stamp) always makes sense, for both MyISAM and InnoDB. See http://sqlfiddle.com/#!2/0aa08/1 for a demonstration.
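As a rough sketch of that change (not taken from the linked fiddle), the key could be reordered in place and the typical query checked with EXPLAIN. The literal values 7, 1300000000 and 1300086400 are hypothetical stand-ins for A, B and C from the question.

```sql
-- Reorder the compound primary key so the equality column comes first.
-- On a large table this rebuilds the table, so expect it to take a while.
ALTER TABLE measurements
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (fk_channel, tm_stamp);

-- Check how the most common query uses the new key.
-- The channel id and timestamp bounds below are made-up example values.
EXPLAIN
SELECT value
FROM measurements
WHERE fk_channel = 7
  AND tm_stamp >= 1300000000
  AND tm_stamp <= 1300086400
ORDER BY tm_stamp ASC;
```

With fk_channel leading, the matching rows for one channel form a single contiguous range of the key, already sorted by tm_stamp. Keep the ORDER BY in the query regardless, since SQL only guarantees result order when it is requested explicitly.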
Original answer:
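To determine whether changing the primary key from (tm_stamp, fk_channel) to (fk_channel, tm_stamp) would improve your query's performance, you need to determine which column has the higher cardinality (that is, which column's values are more varied). Counting the distinct values in each column will give you the cardinality of the columns.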
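The original query did not survive in this copy of the answer. One way to measure the two cardinalities, assuming the table and column names from the question, is a pair of COUNT(DISTINCT ...) aggregates; the column aliases are only illustrative.

```sql
-- Count the distinct values in each key column.
-- This scans the whole table, so it may be slow on millions of rows.
SELECT
  COUNT(DISTINCT tm_stamp)   AS tm_stamp_cardinality,
  COUNT(DISTINCT fk_channel) AS fk_channel_cardinality
FROM measurements;

-- A cheaper alternative: the Cardinality column of SHOW INDEX.
-- It covers the indexed column prefixes only, and for InnoDB it is an estimate.
SHOW INDEX FROM measurements;
```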
So, to answer your question properly, we first need to know: what is the typical range between B and C? 60? 3,600? 86,400? More?
For example, let's say that the cardinality check returns 32,768 for tm_stamp and 256 for fk_channel. 32,768 divided by 256 is 128. This tells us that tm_stamp has roughly 128 unique values for every value of fk_channel.
So if the difference between B and C is usually less than 128, then leave tm_stamp as the first field in the primary key. If it is 128 or greater, then make fk_channel the first field.
Another question: does fk_channel need to be an INT (4 billion unique values, half of which are negative)? If not, then changing fk_channel to TINYINT UNSIGNED (if you have 256 unique values) or SMALLINT UNSIGNED (65,536 unique values) would save a lot of time and space.
For example, let's say you have at most 256 possible fk_channel values and 65,536 possible value values. You could then change your schema by creating a new table with the narrower column types and copying the existing rows into it, sorted by the new primary key. This will store the existing data in the new table in PRIMARY KEY order, which will improve performance somewhat.
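The schema-change statements are missing from this copy of the answer, so the following is only a sketch of what such a migration might look like. The table names measurements_new and measurements_old are hypothetical, and the column types assume the 256-channel and 65,536-value limits from the example, with no negative readings in value.

```sql
-- New table with narrower types and fk_channel leading the primary key.
-- Assumes fk_channel fits in 0..255 and value fits in 0..65535.
CREATE TABLE measurements_new (
  fk_channel TINYINT UNSIGNED NOT NULL DEFAULT 0,
  tm_stamp   INT NOT NULL DEFAULT 0,
  value      SMALLINT UNSIGNED DEFAULT NULL,
  PRIMARY KEY (fk_channel, tm_stamp)
) ENGINE=InnoDB;

-- Copy the existing rows in primary-key order so they are inserted sequentially.
INSERT INTO measurements_new (fk_channel, tm_stamp, value)
SELECT fk_channel, tm_stamp, value
FROM measurements
ORDER BY fk_channel, tm_stamp;

-- Swap the tables in a single rename; keep the old one until the copy is verified.
RENAME TABLE measurements TO measurements_old,
             measurements_new TO measurements;
```

Before running anything like this, it is worth checking MIN() and MAX() of fk_channel and value in the existing data, since out-of-range values would be rejected or clipped depending on the SQL mode.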