I have a huge table: CREATE TABLE `messageline` ( `id` bigint(20) NOT NULL AUTO_INCREMENT,

Asked: June 17, 2026 (2026-06-17T12:38:17+00:00)

I have a huge table:

 CREATE TABLE `messageline` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `hash` bigint(20) DEFAULT NULL,
  `quoteLevel` int(11) DEFAULT NULL,
  `messageDetails_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `FK2F5B707BF7C835B8` (`messageDetails_id`),
  KEY `hash_idx` (`hash`),
  KEY `quote_level_idx` (`quoteLevel`),
  CONSTRAINT `FK2F5B707BF7C835B8` FOREIGN KEY (`messageDetails_id`) REFERENCES `messagedetails` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=401798068 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

I need to find duplicate lines this way:

create table foundline AS
select ml.messagedetails_id, ml.hash, ml.quotelevel
from messageline ml,
     messageline ml1
where ml1.hash = ml.hash
  and ml1.messagedetails_id!=ml.messagedetails_id

But this query has already been running for more than a day, which is too long. A few hours would be OK. How can I speed it up? Thanks.

Explain:

+----+-------------+-------+------+---------------+----------+---------+---------------+-----------+-------------+
| id | select_type | table | type | possible_keys | key      | key_len | ref           | rows      | Extra       |
+----+-------------+-------+------+---------------+----------+---------+---------------+-----------+-------------+
|  1 | SIMPLE      | ml    | ALL  | hash_idx      | NULL     | NULL    | NULL          | 401798409 |             |
|  1 | SIMPLE      | ml1   | ref  | hash_idx      | hash_idx | 9       | skryb.ml.hash |         1 | Using where |
+----+-------------+-------+------+---------------+----------+---------+---------------+-----------+-------------+

1 Answer

Editorial Team · Answer 1 · 2026-06-17T12:38:18+00:00

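You can find the duplicated hashes with a GROUP BY instead of a self-join. Since a duplicate here means the same hash appearing under different messagedetails_id values, group on hash and count the distinct messages:

SELECT hash, COUNT(DISTINCT messagedetails_id) c
FROM messageline ml
GROUP BY hash HAVING c > 1;

If it is still too slow, add a condition on an indexed column to split the query into batches. Split on hash rather than messagedetails_id, so that all rows sharing a hash fall into the same batch:

WHERE hash < 100000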
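
To get the same columns the question's foundline table asks for, one option is to compute the duplicated hashes once and join them back to messageline. This is only a sketch under assumptions: it treats a duplicate as a hash that occurs under more than one messageDetails_id (as in the question's query), and the composite index name hash_msg_idx is hypothetical. That index may let the GROUP BY run from the index alone, but building it on a table this size takes time too, so it is left commented out.

```sql
-- Optional, hypothetical index (the name hash_msg_idx is an assumption):
-- ALTER TABLE messageline ADD INDEX hash_msg_idx (hash, messageDetails_id);

CREATE TABLE foundline AS
SELECT ml.messageDetails_id, ml.hash, ml.quoteLevel
FROM messageline ml
JOIN (
    -- hashes that occur in more than one message
    SELECT hash
    FROM messageline
    WHERE hash IS NOT NULL
    GROUP BY hash
    HAVING COUNT(DISTINCT messageDetails_id) > 1
) dup ON dup.hash = ml.hash
-- the original "!=" comparison also drops rows with a NULL message id
WHERE ml.messageDetails_id IS NOT NULL;
```

Note that the self-join in the question emits one row per matching pair of lines, while this version emits each matching line once. To run it in batches, the same hash range can be added to the inner query, with INSERT INTO foundline ... SELECT used for every range after the first.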
