I’m having trouble optimizing this query: SELECT a.id FROM a JOIN b ON a.id=b.id

Asked: May 22, 2026

I’m having trouble optimizing this query:

SELECT a.id
FROM a
JOIN b ON a.id=b.id
LEFT JOIN c ON a.id=c.id
WHERE
   (b.c1='12345' OR c.c1='12345')
   AND (a.c2=0 OR b.c3=1)
   AND a.c4='active'
GROUP BY a.id;

The query takes 7s, whereas it takes 0s when only one of b or c is JOINed. The EXPLAIN:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: ref
possible_keys: PRIMARY(id),c4,c2
          key: c4
      key_len: 1
          ref: const
         rows: 80775
        Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
         type: ref
possible_keys: id_c1_unique,id
          key: id_c1
      key_len: 4
          ref: database.a.id
         rows: 1
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: id_c1_unique,id,c1,c3
          key: id
      key_len: 4
          ref: database.a.id
         rows: 2
        Extra: Using where

There is always exactly 1 matching row from b, and at most one matching row from c. It would go much faster if MySQL started by getting the b and c rows that match the c1 literal and then joined a based on id, but it starts with a instead.

Details:

MyISAM
All columns have indexes (_unique are UNIQUE)
All columns are NOT NULL

What I’ve tried:

Changing the order of the JOINs
Moving the WHERE conditions to the ON clauses
Subselects for b.c1 and c.c1 (WHERE b.id=(SELECT b.id FROM b WHERE c1='12345'))
USE INDEX for b and c

I understand I could do this using two SELECTs with a UNION, but I need to avoid that if at all possible because of how the query is being generated.
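
For reference, the UNION form mentioned above might look roughly like this. It is only a sketch, assuming the tables and the literal from the question: each branch keeps a single selective c1 predicate, so the optimizer should be free to start from b or c through the c1 index rather than from a. The second branch can use a plain JOIN on c because c.c1='12345' can only be true for rows that have a match in c.

```sql
-- Branch 1: rows that qualify through b.c1 (c is not needed here)
SELECT a.id
FROM a
JOIN b ON a.id=b.id
WHERE b.c1='12345'
  AND (a.c2=0 OR b.c3=1)
  AND a.c4='active'
UNION
-- Branch 2: rows that qualify through c.c1 (b is still joined for b.c3)
SELECT a.id
FROM a
JOIN b ON a.id=b.id
JOIN c ON a.id=c.id
WHERE c.c1='12345'
  AND (a.c2=0 OR b.c3=1)
  AND a.c4='active';
```

UNION (without ALL) removes duplicate ids, which takes over the role of the GROUP BY a.id in the original query.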

Edit: Add CREATE TABLEs

CREATE TABLEs with the relevant columns.

CREATE TABLE `a` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `c2` tinyint(1) NOT NULL,
  `c4` enum('active','pending','closed') NOT NULL,
  PRIMARY KEY (`id`),
  KEY `c2` (`c2`),
  KEY `c4` (`c4`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `b` (
    `b_id` int(11) NOT NULL AUTO_INCREMENT,
    `id` int(11) NOT NULL DEFAULT '0',
    `c1` int(11) NOT NULL,
    `c3` tinyint(1) NOT NULL,
    PRIMARY KEY (`b_id`),
    UNIQUE KEY `id_c1_unique` (`id`,`c1`),
    KEY `c1` (`c1`),
    KEY `c3` (`c3`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `c` (
    `c_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    `id` int(11) NOT NULL,
    `c1` int(11) NOT NULL,
    PRIMARY KEY (`c_id`),
    UNIQUE KEY `id_c1_unique` (`id`,`c1`),
    KEY `id` (`id`),
    KEY `c1` (`c1`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

1 Answer

Editorial Team · Answer 1 · 2026-05-22T18:37:41+00:00

OP answering here.

What I’ve determined is that MySQL reading the less efficient table first is inherent to every LEFT JOIN where the less efficient table is on the left side. According to “LEFT JOIN and RIGHT JOIN Optimization” in the MySQL manual:

MySQL implements an A LEFT JOIN B join_condition as follows:

Table B is set to depend on table A and all tables on which A depends.

So:

SELECT a.id
FROM a
LEFT JOIN c ON a.id=c.id
GROUP BY a.id;

will always read a first, even when the query plan shows that reading c is more efficient. Switching the tables causes MySQL to read from c first:

SELECT a.id
FROM c
LEFT JOIN a ON c.id=a.id
GROUP BY a.id;

In my case both queries return the same results. Apparently there is something conceptual that I’m missing that requires the left-side table to always be read first in a LEFT JOIN. It seems to me that the right-side table could just as easily be read first and MySQL could still generate the same results (for certain queries, not necessarily for all LEFT JOINs). If that were possible, though, that optimization would probably have been added long ago, so I guess I’m just missing the concept.
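
One piece of the concept, as far as the same manual page goes: a LEFT JOIN has to keep every row of the left table, including those with no match on the right, which is why the right table is made to depend on the left. The manual also notes that if the WHERE condition is always false for the generated NULL row, the LEFT JOIN is changed to an inner join, and then the join order is no longer fixed. A minimal sketch of the difference, assuming the tables from the question (the plans actually chosen will depend on the data and MySQL version, so compare the EXPLAIN output rather than taking this at face value):

```sql
-- The predicate on c can never be true for a NULL-extended row, so MySQL
-- may treat this LEFT JOIN as an inner join and is free to read c first.
EXPLAIN
SELECT a.id
FROM a
LEFT JOIN c ON a.id=c.id
WHERE c.c1='12345'
GROUP BY a.id;

-- Here a row with no match in c can still qualify through b.c1, so the
-- LEFT JOIN must stay an outer join and c keeps depending on a.
EXPLAIN
SELECT a.id
FROM a
JOIN b ON a.id=b.id
LEFT JOIN c ON a.id=c.id
WHERE (b.c1='12345' OR c.c1='12345')
GROUP BY a.id;
```

The second form is the shape of the query in the question: because c.c1 is ORed with b.c1, the condition does not rule out the NULL row, and the conversion does not apply.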

In the end switching the order of the tables wasn’t a good solution for me. I ended up merging b and c into a single table, which simplified the application and should have been done to begin with. With a single table I can do a JOIN instead of a LEFT JOIN, avoiding the issue altogether.

Another possible solution might be creating a view that incorporates both tables, thereby giving a single view to JOIN from. I didn’t test that though.
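
An untested sketch of what such a view could look like follows. The view name bc_c1 is hypothetical, and the rewrite assumes (as stated in the question) that every a row has exactly one matching b row, so joining b separately for b.c3 does not change the result.

```sql
-- Hypothetical view exposing the (id, c1) pairs of both tables
CREATE VIEW bc_c1 AS
SELECT id, c1 FROM b
UNION ALL
SELECT id, c1 FROM c;

SELECT a.id
FROM a
JOIN b ON a.id=b.id            -- still needed for b.c3
JOIN bc_c1 ON a.id=bc_c1.id    -- inner join, so no LEFT JOIN ordering constraint
WHERE bc_c1.c1='12345'
  AND (a.c2=0 OR b.c3=1)
  AND a.c4='active'
GROUP BY a.id;
```

One caveat: a view containing UNION cannot use the MERGE algorithm, so MySQL materializes it into a temporary table first. Whether this ends up faster than the original query would need to be checked with EXPLAIN on real data.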

TL;DR: Change the order of the tables to put the most efficient first (if the result set is the same regardless of the order). Or merge b and c into a single table. Or possibly create a view that combines b and c.
