I am currently working on a project that involves altering data stored in a MySQL database. Since the table I am working on does not have a key, I add one with the following command:
ALTER TABLE deCoupledData ADD COLUMN MY_KEY INT NOT NULL AUTO_INCREMENT KEY
Because I want to group my records according to selected fields, I try to create an index on the table deCoupledData that consists of MY_KEY along with the selected fields. For example, if I want to work with the fields STATED_F and NOT_STATED_F, I type:
ALTER TABLE deCoupledData ADD INDEX (MY_KEY, STATED_F, NOT_STATED_F)
The real issue is that I usually work with more than 16 fields, and MySQL does not allow an index (super-key) with more than 16 columns.
So, is there another way to do this? Can I somehow make MySQL store the records ordered by the desired super-key (something like clustering)? I really need to make my script faster, and the main overhead is that each group may contain records that are not stored on the same disk page, so I assume my PC performs random I/O to retrieve them.
Thank you for your time.
Nick Katsipoulakis
CREATE TABLE deCoupledData (
  AA double NOT NULL DEFAULT '0',
  STATED_F double DEFAULT NULL,
  NOT_STATED_F double DEFAULT NULL,
  MIN_VALUES varchar(128) NOT NULL DEFAULT '-1,-1',
  MY_KEY int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (MY_KEY),
  KEY AA (AA)
) ENGINE=InnoDB AUTO_INCREMENT=74358 DEFAULT CHARSET=latin1
Okay, first of all, when you add an index over multiple columns and your queries don’t filter on the first (leftmost) column, the index is useless for those lookups.
Example: you have a query that filters only on columns other than MY_KEY (STATED_F, say) and an index over (MY_KEY, STATED_F, NOT_STATED_F).
The index can only be used if you also have something like AND my_key = 1 in the WHERE clause.
Imagine you want to look up every person in a telephone book with the first name ‘John’. The knowledge that the book is sorted by last name is useless then; you still have to look at every single name.
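To see this on your own table, a rough sketch like the following might help. It assumes the index over (MY_KEY, STATED_F, NOT_STATED_F) from the question already exists, and the filter values are made up purely for illustration. Compare the `key` column of the two plans that EXPLAIN prints.

```sql
-- Hypothetical queries; the literal values are placeholders.

-- Filters only on non-leading columns of the index, so the
-- (MY_KEY, STATED_F, NOT_STATED_F) index cannot narrow the search.
EXPLAIN
SELECT *
FROM deCoupledData
WHERE STATED_F = 1.5
  AND NOT_STATED_F = 2.5;

-- Also filters on the leading column MY_KEY, so an index that
-- starts with MY_KEY becomes usable.
EXPLAIN
SELECT *
FROM deCoupledData
WHERE MY_KEY = 1
  AND STATED_F = 1.5
  AND NOT_STATED_F = 2.5;
```

The exact plan depends on your MySQL version and data, so treat the output as a diagnostic rather than a guarantee.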
Also, the primary key does not have to be a surrogate / artificial one. It’s nearly always better to have a primary key which is made up of columns which identify each row uniquely anyway.
Also, it’s not always good to have many indexes. Not only do indexes slow down INSERTs and UPDATEs, sometimes they just cause an extra lookup: first the index is read, and then a second read is needed to fetch the actual row.
Those are just a few tips. Maybe Jordan’s hint is not a bad idea: “You should maybe post a new question that has your actual SQL query, table layout, and performance questions”.
UPDATE:
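Yes, that is possible. According to the manual, InnoDB stores the rows of a table in a clustered index, which is normally the primary key,
which means that the data is practically sorted on disk by primary key, yes.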
Be aware that it’s also possible to define a primary key over multiple columns, as long as the combination of the columns is unique.
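A multi-column primary key of that kind might be declared as in the sketch below. The table name deCoupledDataClustered is hypothetical, the choice of grouping columns is only an example, and the sketch assumes the grouping columns contain no NULLs, since primary key columns must be NOT NULL. MY_KEY is appended to the key to keep each combination unique, and it gets its own index because InnoDB requires an AUTO_INCREMENT column to be the first column of some index.

```sql
-- Hypothetical copy of the table, clustered on the grouping columns.
CREATE TABLE deCoupledDataClustered (
  STATED_F     DOUBLE NOT NULL,
  NOT_STATED_F DOUBLE NOT NULL,
  MY_KEY       INT NOT NULL AUTO_INCREMENT,
  AA           DOUBLE NOT NULL DEFAULT 0,
  MIN_VALUES   VARCHAR(128) NOT NULL DEFAULT '-1,-1',
  -- Rows are stored in this order, so each group sits together on disk.
  PRIMARY KEY (STATED_F, NOT_STATED_F, MY_KEY),
  -- Needed so that the AUTO_INCREMENT column leads an index.
  KEY (MY_KEY)
) ENGINE=InnoDB;

-- Copy the existing rows, skipping any with NULL grouping values.
INSERT INTO deCoupledDataClustered
  (STATED_F, NOT_STATED_F, MY_KEY, AA, MIN_VALUES)
SELECT STATED_F, NOT_STATED_F, MY_KEY, AA, MIN_VALUES
FROM deCoupledData
WHERE STATED_F IS NOT NULL
  AND NOT_STATED_F IS NOT NULL;
```

Keep in mind that the 16-column limit applies to a primary key as well, so this only helps when the grouping columns plus MY_KEY fit within it.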