I’ve got two queries that seem more or less identical, but they have very different performance and I can’t figure out why. Basically, I’ve got a large table (6M rows) of blogs that I need to lock and process. Originally, the query looked like this:
insert into bloglock
select b.id, [lockid], CURRENT_TIMESTAMP
from blog b
left join bloglock bl
on b.id=bl.blogid
WHERE
State=[somestate]
AND
bl.BlogId IS NULL
The semantics of this are:
Lock a blog by making an entry in the bloglock table consisting of the blog id, a lock id, and the current timestamp. But only do this if the blog state is what we want to lock, and if the blog isn’t already locked (bl.BlogId IS NULL).
This query runs very fast, in about 0.05 seconds.
But then I had to force certain blogs to be processed before others. So I added an integer priority field, an index on this field, and changed the query to:
insert into bloglock
select b.id, [lockid], CURRENT_TIMESTAMP
from blog b
left join bloglock bl
on b.id=bl.blogid
WHERE
State=[somestate]
AND
bl.BlogId IS NULL
order by Priority desc
LIMIT 100;
Same as before, except that it takes only the first 100 rows in order of priority. This query is dog slow, taking around 30 seconds to execute. I can accept that. But the puzzle is that I rewrote it to this:
insert into bloglock
select *
from
(select b.id, [lockid], CURRENT_TIMESTAMP
from blog b
left join bloglock bl
on b.id=bl.blogid
WHERE
State=[somestate]
AND
bl.BlogId IS NULL
order by Priority desc
LIMIT 100) InnerQuery;
And now it runs as fast as before. I don’t get this. It seems like the same query to me, and I would certainly expect it to look the same to the optimizer, but instead there’s a roughly 600-fold difference in performance (about 30 seconds versus about 0.05 seconds). What’s going on here?
There is no logical reason why the third query should be faster than the second. I believe you have probably run into a bug in MySQL’s query optimizer. You might consider filing a bug report about this with the MySQL developers.
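Before filing a report, it may help to capture the execution plans of the slow and the fast form so the difference is visible. The following is only a sketch under a few assumptions: the literals 42 and 1 are made-up stand-ins for [lockid] and [somestate], and EXPLAIN on an INSERT ... SELECT statement requires MySQL 5.6.3 or later (on older servers, EXPLAIN can be run on the SELECT part alone, which may not show the same difference).

```sql
-- Hypothetical values: 42 stands in for [lockid], 1 for [somestate].

-- Slow form: ORDER BY / LIMIT directly in the INSERT ... SELECT.
EXPLAIN
insert into bloglock
select b.id, 42, CURRENT_TIMESTAMP
from blog b
left join bloglock bl
on b.id=bl.blogid
WHERE
State=1
AND
bl.BlogId IS NULL
order by Priority desc
LIMIT 100;

-- Fast form: the same SELECT wrapped in a derived table.
EXPLAIN
insert into bloglock
select *
from
(select b.id, 42, CURRENT_TIMESTAMP
from blog b
left join bloglock bl
on b.id=bl.blogid
WHERE
State=1
AND
bl.BlogId IS NULL
order by Priority desc
LIMIT 100) InnerQuery;
```

Comparing the key, rows, and Extra columns of the two outputs (for example, whether "Using temporary" or "Using filesort" appears, and which index is chosen for blog) should show where the two plans diverge, and both outputs would be useful attachments to a bug report.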