I have a table like this CREATE TABLE values ( id int(10) auto_increment NOT

Asked: June 7, 2026 (2026-06-07T11:48:51+00:00)

I have a table like this

CREATE TABLE `values` (
    id int(10) auto_increment NOT NULL,
    molecule_id int(5) NOT NULL,
    descriptor_id int(5) NOT NULL,
    T double DEFAULT NULL,
    value double NOT NULL,
    PRIMARY KEY (id),
    KEY index1 (molecule_id, T),
    KEY index2 (descriptor_id, T)
) ENGINE=InnoDB;

Rows of the table are many combinations of 3000 descriptor_ids, 600 molecule_ids and 3500 Ts with random double values (about 2 billion rows).

I was under the impression that for a query like

SELECT T, value FROM `values` WHERE molecule_id = X AND descriptor_id = Y

MySQL would use both keys and then intersect the results. But running EXPLAIN EXTENDED on this query tells me it only uses index2, having chosen between index1 and index2.

molecule_id = X hits about 1/600 of the table.
descriptor_id = Y hits either a very small part of the table (like 0.001%) or about 1/700, depending on Y.

It seems like intersecting would be faster than just using index2 and then scanning the remaining rows, which can be over ~2.5 million. Even if the 3000 descriptor_ids were evenly distributed, it would still leave 800,000 rows to scan on average.

What am I missing?

1 Answer

Editorial Team · Answer 1 · 2026-06-07T11:48:53+00:00

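spencer7593 has it right. For an AND of equality conditions, MySQL can only use an index_merge intersection on secondary indexes when each condition covers every column of its index. Here both index1 and index2 also contain T, which the query does not constrain, so the optimizer picks one index instead. If your AND were an OR, it could trigger an index_merge. However, since it is an AND, why not make a multi-column index on both molecule_id and descriptor_id? That will get you better results, and faster. If descriptor_id is more selective (as you mentioned), do this: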

ALTER TABLE `values` ADD INDEX descriptor_molecule (descriptor_id, molecule_id, T, value);

As long as your query has both columns in the WHERE clause with an AND condition, it can use this index. In this case, I would actually drop your index2: if a query only has the descriptor_id column in the WHERE clause, it can still use the leftmost prefix of the descriptor_molecule index. Plus, indexing all 4 columns creates a covering index for the query you mentioned, so MySQL can answer it from the index alone without reading the table rows, which should speed up your query by quite a bit.
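
To check that the optimizer actually picks the new index, something like the following could be run once descriptor_molecule has been created. This is only a sketch: the literals 42 and 7 are made-up stand-ins for X and Y, and the table name is backtick-quoted because VALUES is a reserved word in MySQL.

```sql
-- Hypothetical values 42 and 7 stand in for X and Y from the question.
EXPLAIN
SELECT T, value
FROM `values`
WHERE molecule_id = 42
  AND descriptor_id = 7;
-- If the new index is used, the "key" column should name descriptor_molecule,
-- and "Extra" should include "Using index", which indicates a covering read.

-- Only after the plan looks right: drop the index that is now redundant
-- for lookups on descriptor_id alone.
ALTER TABLE `values` DROP INDEX index2;
```

If EXPLAIN still reports index2 or index1, it is worth sorting that out before dropping anything.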
