So basically I have a table with about 700 million rows. It is updated constantly, with about 200k-300k new rows per day, and at every month end I wipe out the data that is more than 3 months old.
CREATE TABLE TESTRECORD (
TIMEADDED timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
SN varchar(8) NOT NULL,
ENDTIME varchar(14) NOT NULL,
MODEL varchar(2) NOT NULL,
PROCESS int(4) NOT NULL,
PF varchar(4) NOT NULL,
COMID varchar(6) NOT NULL,
COMTP varchar(3) NOT NULL,
TRIAL varchar(4) NOT NULL,
TEST varchar(8) NOT NULL,
SECTION int(2) NOT NULL,
DATA_0 float NOT NULL,
DATA_1 float NOT NULL,
DATA_2 float NOT NULL,
DATA_3 float NOT NULL,
DATA_4 float NOT NULL,
DATA_5 float NOT NULL,
PRIMARY KEY (SN,ENDTIME,SECTION),
UNIQUE KEY BASESN (SN,ENDTIME,MODEL,PROCESS,PF,COMID,TRIAL,TEST,SECTION),
KEY COMID (COMID),
KEY TRIAL (TRIAL),
KEY PF (PF),
KEY TEST (TEST)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
The unique key defines the columns that will be used in the select statements.
Since this table is mainly used for dynamic data analysis, there is no fixed order in which the columns appear in the where clause and no fixed number of them, and there could also be a group by on one or two of the columns from the unique key. So it is pretty much impossible to index every possible combination to ensure fast operation on any given select.
To my understanding, MySQL uses the columns of a multi-column index in the order they are listed in the schema, so in my case, if I use SN, ENDTIME and PF in a select statement, only the first 2 columns of the unique key will be used. Is there an efficient way to break the index down, like 1 index per column, or a query technique to speed things up a bit, or at least to get roughly equal performance across the different combinations of columns in the where clause?
thank you very much in advance~!!!
Indexes in MySQL work like an index you might find at the end of a book. If you are looking in a cookbook for “pepperoni pizza”, you first look up pepperoni, and then pizza. If you are looking for just “pizza” then that index is of no help to you, because pizza is secondary to pepperoni in the index: you can only find pizza if you look up pepperoni first. That is how an index on columns X,Y works. If your queries filter on X, or on X and Y together, then an index on the two columns together makes sense (the order in which the conditions are written in the where clause does not matter, only which columns are present). If you want to run queries on X alone and queries on Y alone, then the compound index only helps the queries on X, and the queries on Y get nothing from it!
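To make the cookbook analogy concrete, here is a rough sketch you could run against the table from the question. The literal values ('AB123456', 'PASS' and so on) are made up for illustration, and I am assuming ENDTIME holds a 'YYYYMMDDHHMMSS' string, which the question does not actually say. EXPLAIN only shows the plan; look at the key and key_len columns to see which index, and how much of it, the optimizer chose.

```sql
-- Filters on the leading columns SN and ENDTIME, so the lookup can use
-- BASESN (or PRIMARY, which starts with the same two columns).
EXPLAIN SELECT * FROM TESTRECORD
WHERE SN = 'AB123456' AND ENDTIME = '20240101120000';

-- SN is the leading column, but PF is not next to it in BASESN, so only
-- the SN part of that index can narrow the search. The single-column
-- PF index is another candidate the optimizer may consider.
EXPLAIN SELECT * FROM TESTRECORD
WHERE SN = 'AB123456' AND PF = 'PASS';

-- SN is missing entirely, so BASESN cannot be used for the lookup:
-- this is the "pizza without pepperoni" case.
EXPLAIN SELECT * FROM TESTRECORD
WHERE MODEL = 'A1' AND PROCESS = 10;
```

Which index actually gets picked depends on your data and MySQL version, so treat the comments as what is possible rather than what will happen.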
I would recommend that you sit down and define what kind of queries you will run most often, and analyse your storage and processing capacity. Indexes can take up a lot of space, especially when working in the millions of rows. Indexes are a classic tradeoff between storage space and processing power, and nobody unfamiliar with your database can tell you what is the best number or configuration of indexes for your particular situation.
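As a starting point for the storage side of that tradeoff, something like the following shows how much space the data and the indexes currently take. This is only a sketch: it assumes you are connected to the schema that holds TESTRECORD, and the figures are reported in bytes.

```sql
-- Compare the size of the row data with the size of all indexes.
SELECT TABLE_NAME,
       TABLE_ROWS,
       DATA_LENGTH  / 1024 / 1024 AS data_mb,
       INDEX_LENGTH / 1024 / 1024 AS index_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'TESTRECORD';
```

Running it before and after adding or dropping an index gives you a rough idea of what each index costs.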
Look also at the number of unique values stored in each column. MySQL, unlike Oracle, does not support bitmap-style indexes for standard tables (it uses B-Tree). Technical details aside, this means that building an index on a column with a relatively small number of unique values won’t provide you with as much value per unit of index space as you may think.
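A quick way to check this, sketched under the assumption that ENDTIME is a sortable 'YYYYMMDDHHMMSS' string (the date below is invented): SHOW INDEX reports an estimated cardinality per index, and COUNT(DISTINCT ...) gives exact numbers if you are willing to pay for the scan.

```sql
-- Estimated number of distinct values per indexed column
-- (see the Cardinality column of the output).
SHOW INDEX FROM TESTRECORD;

-- Exact counts for the single-column index candidates. On a table this
-- size, restrict the scan to a slice of the data.
SELECT COUNT(DISTINCT PF)    AS pf_values,
       COUNT(DISTINCT COMID) AS comid_values,
       COUNT(DISTINCT TRIAL) AS trial_values,
       COUNT(DISTINCT TEST)  AS test_values
FROM TESTRECORD
WHERE ENDTIME >= '20240101000000';
```

A column with only a handful of distinct values is unlikely to be worth an index of its own.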
One final note is that for certain types of data analysis, you may want to consider exporting some of your data to a MEMORY table. MEMORY tables keep their data in RAM and are visible across user sessions. They keep their structure, but lose their data when the server is restarted or crashes. MEMORY tables support HASH indexes, which hash the values of indexed columns to speed up data retrieval. They are very fast for equality lookups (they do not help with range conditions), and can improve performance dramatically when used correctly.
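A minimal sketch of that idea follows. The table name TESTRECORD_RECENT, the choice of columns, the cutoff date and the 'C00001' value are all hypothetical, and I am again assuming ENDTIME is a 'YYYYMMDDHHMMSS' string. Keep in mind that the size of a MEMORY table is capped by the max_heap_table_size setting, so only copy the slice you actually need.

```sql
-- Scratch copy of recent rows, held in RAM, with HASH indexes on the
-- columns used for equality filters.
CREATE TABLE TESTRECORD_RECENT (
  SN varchar(8) NOT NULL,
  ENDTIME varchar(14) NOT NULL,
  PF varchar(4) NOT NULL,
  COMID varchar(6) NOT NULL,
  TEST varchar(8) NOT NULL,
  SECTION int NOT NULL,
  DATA_0 float NOT NULL,
  KEY COMID_HASH USING HASH (COMID),
  KEY TEST_HASH USING HASH (TEST)
) ENGINE=MEMORY DEFAULT CHARSET=latin1;

-- Load only the slice you want to analyse.
INSERT INTO TESTRECORD_RECENT
SELECT SN, ENDTIME, PF, COMID, TEST, SECTION, DATA_0
FROM TESTRECORD
WHERE ENDTIME >= '20240101000000';

-- Ad hoc analysis then runs against the in-memory copy.
SELECT PF, COUNT(*) AS row_count, AVG(DATA_0) AS avg_data_0
FROM TESTRECORD_RECENT
WHERE COMID = 'C00001'
GROUP BY PF;
```

The copy has to be reloaded after a server restart, since only the table definition survives.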
I would recommend you look at the book “High Performance MySQL” if you are really interested in DB optimisation.