The problem is as follows:
-I have a table INIT with the structure
(number1 INT not null, number2 INT not null, ..., number7 INT not null)
-I want to insert all rows of INIT into a table ‘tab’, but ‘tab’ must not contain
two rows such that one is a permutation of the other. For example,
if (1,2,3,7,19,21,6) and (19,2,3,7,1,21,6) are both rows in INIT, then exactly one
of them has to end up in ‘tab’. It doesn’t matter which one.
-My code below does the following: I keep an auxiliary table ‘aux’ with
the same structure as INIT. I iterate over all rows of INIT and sort the components
of each row in increasing order, so the row (1,2,3,7,19,21,6) becomes
(1,2,3,6,7,19,21). I then check whether the sorted row is in ‘aux’. If it is,
I continue to the next row. Otherwise, I insert the original row (1,2,3,7,19,21,6) into ‘tab’.
I ran this procedure on an INIT table that contains 300,000 rows, and I estimate
that it takes over 7 hours to run. I would like to know how I can improve its
running time.
DECLARE done BOOLEAN default 0;
DECLARE n1,n2,n3,n4,n5,n6,n7 INT;
DECLARE o1,o2,o3,o4,o5,o6,o7 INT;
DECLARE my_cursor cursor FOR select * from INIT;
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done=1;
OPEN my_cursor;
-- aux holds the sorted form of every row already accepted into tab
drop table if exists aux;
create table aux(
    number1 INT not null,
    number2 INT not null,
    number3 INT not null,
    number4 INT not null,
    number5 INT not null,
    number6 INT not null,
    number7 INT not null
);
-- temp is scratch space used to sort the seven components of one row
drop table if exists temp;
create table temp( number INT );
REPEAT
    truncate table temp;
    FETCH my_cursor INTO n1,n2,n3,n4,n5,n6,n7;
    INSERT INTO temp values(n1);
    INSERT INTO temp values(n2);
    INSERT INTO temp values(n3);
    INSERT INTO temp values(n4);
    INSERT INTO temp values(n5);
    INSERT INTO temp values(n6);
    INSERT INTO temp values(n7);
    BEGIN
        DECLARE done2 BOOLEAN default 0;
        -- read the seven components back in increasing order
        DECLARE my_cursor2 cursor FOR select * from temp order by number;
        OPEN my_cursor2;
        FETCH my_cursor2 INTO o1;
        FETCH my_cursor2 INTO o2;
        FETCH my_cursor2 INTO o3;
        FETCH my_cursor2 INTO o4;
        FETCH my_cursor2 INTO o5;
        FETCH my_cursor2 INTO o6;
        FETCH my_cursor2 INTO o7;
        IF NOT EXISTS (SELECT * FROM aux where number1=o1 AND number2=o2 AND number3=o3
            AND number4=o4 AND number5 = o5 AND number6 = o6 AND number7=o7 )
        THEN
            INSERT INTO tab VALUES (n1,n2,n3,n4,n5,n6,n7);
            -- record the sorted form so that later permutations are skipped
            INSERT INTO aux VALUES (o1,o2,o3,o4,o5,o6,o7);
        END IF;
        CLOSE my_cursor2;
    END;
UNTIL done END REPEAT;
CLOSE my_cursor;
EDITED:
-In each row of INIT, all integers are different.
-The primary key of INIT is (number1,number2,…,number7)
You’re doing a heavy query for every row… not a good approach.
Instead, you can use some database kung fu to get the job done without a stored proc:
The key tricks involved here are:
-group_concat to do the work of grouping up the numbers in a standard order so the combinations can be compared
-group_concat with order by gives you a unique signature for the numbers
-group by without aggregation in mysql gives you the first row for each group-by column value
BTW, the correct term is combinations, not permutations.
Also, I haven’t tested this, so there could be a misplaced bracket etc, but it should “basically” work
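The query itself is not shown above. As a rough sketch of what the three tricks could look like when put together (not necessarily the exact query that was meant), something along these lines might work. It assumes that ‘tab’ has the same seven columns as INIT and that the ONLY_FULL_GROUP_BY SQL mode is disabled, since the outer group by selects non-aggregated columns. The names unpivoted, signed_rows, num and signature are made up for illustration.

```sql
-- Sketch only: assumes MySQL with ONLY_FULL_GROUP_BY disabled and a table
-- `tab` with the same seven columns as INIT. All aliases are illustrative.
INSERT INTO tab (number1, number2, number3, number4, number5, number6, number7)
SELECT number1, number2, number3, number4, number5, number6, number7
FROM (
    -- One signature per INIT row: its seven values sorted and comma-joined,
    -- e.g. (1,2,3,7,19,21,6) gives '1,2,3,6,7,19,21'.
    SELECT number1, number2, number3, number4, number5, number6, number7,
           GROUP_CONCAT(num ORDER BY num) AS signature
    FROM (
        -- Unpivot: emit every row seven times, once per component.
        SELECT i.*, number1 AS num FROM INIT i UNION ALL
        SELECT i.*, number2 FROM INIT i UNION ALL
        SELECT i.*, number3 FROM INIT i UNION ALL
        SELECT i.*, number4 FROM INIT i UNION ALL
        SELECT i.*, number5 FROM INIT i UNION ALL
        SELECT i.*, number6 FROM INIT i UNION ALL
        SELECT i.*, number7 FROM INIT i
    ) AS unpivoted
    -- The seven columns are the primary key, so each group is one INIT row.
    GROUP BY number1, number2, number3, number4, number5, number6, number7
) AS signed_rows
-- Rows that are permutations of each other share a signature, so grouping
-- by it keeps a single representative for each combination.
GROUP BY signature;
```

Because MySQL documents the non-aggregated values chosen for each group as nondeterministic, this only fits a case like this one, where it doesn't matter which of the permuted rows ends up in ‘tab’.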