Hi!
I need to scan a very large table on mysql (5.1),
this is how the table looks like more or less:
CREATE TABLE `big_table` ( `id` BIGINT(11) NOT NULL AUTO_INCREMENT, `main_id` INT(11) DEFAULT NULL, `key` VARCHAR(20) NOT NULL, PRIMARY KEY (`id`), KEY `main_id_key` (`main_id`,`key`), ) ENGINE=INNODB AUTO_INCREMENT=2315678197 DEFAULT CHARSET=utf8
I need to get all unique values of main_id + keys into a new table.
Using the following query takes a lot of time (still running after 3 days on a very fast server):
CREATE TABLE `get_unique` ( `main_id` int(11) NOT NULL, `key` varchar(20) NOT NULL, PRIMARY KEY (`main_id`,`key`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT IGNORE INTO get_unique SELECT main_id,key FROM big_table
So my question is –
Will this be faster?
CREATE TABLE `get_unique` ( `main_id` int(11) NOT NULL, `key` varchar(20) NOT NULL, PRIMARY KEY (`main_id`,`key`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO get_unique SELECT main_id,key FROM big_table GROUP BY 1,2
Yes
GROUP BY main_id, keywill perform many times faster as compared toINSERT IGNORE.SELECT.. GROUP BY main_id, keywould get executed faster by making use of covering index and result in a fewer number of records, whereasINSERT IGNOREwill involveINDEX KEYlook-ups for every row that is being inserted.