I found some strange (to me) behaviour in MySQL. I have a simple query:
SELECT CONVERT(`text`.`old_text` USING utf8) AS stext
FROM `text`
WHERE `text`.`old_id` IN
(
    SELECT `revision`.`rev_text_id`
    FROM `revision`
    WHERE `revision`.`rev_id` IN
    (
        SELECT `page_latest`
        FROM `page`
        WHERE `page_id` = 108
    )
)
When I run it, phpMyAdmin reports an execution time of 77.0446 seconds.
But when I replace
WHERE `text`.`old_id` IN
by
WHERE `text`.`old_id` =
the execution time falls to about 0.001 seconds. The result of this query
SELECT `revision`.`rev_text_id`
FROM `revision`
WHERE `revision`.`rev_id` IN
(
    SELECT `page_latest`
    FROM `page`
    WHERE `page_id` = 108
)
is
+-------------+
| rev_text_id |
+-------------+
|        6506 |
+-------------+
Can somebody please explain this behaviour?
There are two primary ways you can increase your query performance here: indexes and joins.
For indexes, add an index on the columns you are matching on:
text.old_id, revision.rev_text_id and page.page_id
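As a rough sketch of the indexing step: first check what is already indexed, then add only what is missing. The table names look like a MediaWiki schema, in which case `old_id` and `page_id` may already be primary keys, but that is an assumption on my part. The index name `idx_rev_text_id` is made up for illustration.

```sql
-- Check which indexes already exist before adding new ones.
SHOW INDEX FROM `text`;
SHOW INDEX FROM `revision`;
SHOW INDEX FROM `page`;

-- Hypothetical index name; only needed if rev_text_id is not already indexed.
CREATE INDEX idx_rev_text_id ON `revision` (`rev_text_id`);
```

If `SHOW INDEX` already lists a key on a column, adding a second index on it will not help and only costs write time.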
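Your next issue is that nested sub-selects are hell on your query execution plan. In older MySQL versions (before 5.6), the optimizer often runs an IN (subquery) as a dependent subquery, re-evaluating it for each row of the outer table, which would explain the gap between IN and =. Here is a good thread discussing JOIN vs. subquery. Here is an article on how to get execution plan info from MySQL.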
Your first look at an execution plan can be confusing, but it will be your best friend when you have to concern yourself with query optimization.
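To see the plan for the slow query, prefix it with `EXPLAIN`. This is only a sketch of how to ask for the plan; the output depends on your MySQL version and data, so none is shown here.

```sql
-- Ask MySQL how it intends to execute the original nested query.
EXPLAIN
SELECT CONVERT(`text`.`old_text` USING utf8) AS stext
FROM `text`
WHERE `text`.`old_id` IN
(
    SELECT `revision`.`rev_text_id`
    FROM `revision`
    WHERE `revision`.`rev_id` IN
    (
        SELECT `page_latest`
        FROM `page`
        WHERE `page_id` = 108
    )
);
```

In the result, a `select_type` of `DEPENDENT SUBQUERY` together with a large `rows` estimate on the outer table is the usual sign that the subquery is being re-run per row.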
Here is an example of your same query with just joins (you could use INNER or LEFT and get pretty much the same result). I don't have your tables or data, so forgive any syntax issues; there is no way I can verify that the code works verbatim in your environment, but it should give you a good starting point.
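A sketch of the join form, untested against your schema. The aliases `p`, `r` and `t` are introduced here for readability. It assumes `rev_id` and `old_id` are unique in their tables; if they are not, a join can return duplicate rows where `IN` would not.

```sql
-- Start from the single page row, then follow the keys outward.
SELECT CONVERT(t.`old_text` USING utf8) AS stext
FROM `page` AS p
INNER JOIN `revision` AS r
    ON r.`rev_id` = p.`page_latest`      -- latest revision of the page
INNER JOIN `text` AS t
    ON t.`old_id` = r.`rev_text_id`      -- text blob for that revision
WHERE p.`page_id` = 108;
```

Starting from `page` lets the filter on `page_id` narrow things to one row before the other tables are touched, provided the join columns are indexed.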