I have the following two queries (*), which only differ in the field being restricted in the WHERE clause (name1 vs name2):
SELECT A.third_id, COUNT(DISTINCT B.fourth_id) AS num
FROM first A
JOIN second B ON A.third_id = B.third_id
WHERE A.name1 LIKE 'term%'
SELECT A.third_id, COUNT(DISTINCT B.fourth_id) AS num
FROM first A
JOIN second B ON A.third_id = B.third_id
WHERE A.name2 LIKE 'term%'
Both of the name fields have a single-column index on them. There is also an index on both third_id columns as well as fourth_id (which are all foreign keys into other tables, but it is not relevant here).
According to EXPLAIN, the first one behaves like this – which is what I want:
+----+-------------+-------+-------+---------------+----------+---------+---------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------+---------+---------------+------+-------------+
| 1 | SIMPLE | A | range | third_id,name | name | 767 | NULL | 3491 | Using where |
| 1 | SIMPLE | B | ref | third_id | third_id | 4 | db.A.third_id | 16 | |
+----+-------------+-------+-------+---------------+----------+---------+---------------+------+-------------+
The second one does this, which I definitely do not want:
+----+-------------+-------+------+----------------+----------+---------+---------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+----------+---------+---------------+--------+-------------+
| 1 | SIMPLE | B | ALL | third_id | NULL | NULL | NULL | 507539 | |
| 1 | SIMPLE | A | ref | third_id,name2 | third_id | 4 | db.B.third_id | 1 | Using where |
+----+-------------+-------+------+----------------+----------+---------+---------------+--------+-------------+
What the heck is happening here? How do I make the second one behave properly (i.e. like the first one)?
(*) Actually, I don’t. I have a bit more complex queries; I have eliminated extras for this post, and distilled them to the minimal queries that still exhibit the problematic behaviour. Also, names were changed to protect the guilty.
Add CREATE TABLE statements to your post.
A real SELECT statement would be helpful too.
1 possible reason is that name2 has a much higher percentage of values starting with “term%”.
Try enforcing order of tables in query by using STRAIGHT_JOIN.