I have three large tables (10k, 10k, and 100M rows) and am trying to do a simple count on a join of them, where all the joined columns are indexed. Why does the COUNT(*) take so long, and how can I speed it up (without triggers and a running summary)?
mysql> describe SELECT COUNT(*) FROM `metaward_alias` INNER JOIN `metaward_achiever` ON (`metaward_alias`.`id` = `metaward_achiever`.`alias_id`) INNER JOIN `metaward_award` ON (`metaward_achiever`.`award_id` = `metaward_award`.`id`) WHERE `metaward_award`.`owner_id` = 8;
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
| 1 | SIMPLE | metaward_award | ref | PRIMARY,metaward_award_owner_id | metaward_award_owner_id | 4 | const | 1552 | |
| 1 | SIMPLE | metaward_achiever | ref | metaward_achiever_award_id,metaward_achiever_alias_id | metaward_achiever_award_id | 4 | paul.metaward_award.id | 2498 | |
| 1 | SIMPLE | metaward_alias | eq_ref | PRIMARY | PRIMARY | 4 | paul.metaward_achiever.alias_id | 1 | Using index |
+----+-------------+-------------------+--------+-------------------------------------------------------+----------------------------+---------+---------------------------------+------+-------------+
3 rows in set (0.00 sec)
But actually running the query takes about 10 minutes, and I’m on MyISAM, so the tables are fully locked for that duration.
I guess the reason is that you do a huge join over three tables (without applying the WHERE clause first, the result would be 10k * 10k * 100M = 10^16 rows). Try to reorder the joins: for example, start with metaward_award, then join only metaward_achiever and see how long that takes, then try to plug in metaward_alias, possibly using a subquery to force your preferred evaluation order.
If that does not help, you might have to denormalize your data, for example by storing the number of aliases for a particular metaward_achiever. Then you’d get rid of one join altogether. Maybe you can even cache the sums for metaward_award, depending on how and how often your data is updated.
Another thing that might help is getting all your database content into RAM 🙂
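To make the "join only metaward_achiever first" step concrete, here is a minimal sketch that compares the original three-table count with a two-table count starting from metaward_award. It is not your real setup: it uses Python's built-in sqlite3 as a stand-in for MySQL, a toy schema containing only the columns that appear in your query, and a handful of made-up rows. It also assumes that alias_id is NOT NULL and always points at an existing metaward_alias row; only under that assumption do the two counts agree, so check it against your data before dropping the join.

```python
import sqlite3

# In-memory toy database; schema and rows are illustrative assumptions,
# reduced to the columns used in the query from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE metaward_award (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL);
    CREATE TABLE metaward_alias (id INTEGER PRIMARY KEY);
    CREATE TABLE metaward_achiever (
        id INTEGER PRIMARY KEY,
        award_id INTEGER NOT NULL REFERENCES metaward_award (id),
        alias_id INTEGER NOT NULL REFERENCES metaward_alias (id)
    );
    CREATE INDEX metaward_award_owner_id ON metaward_award (owner_id);
    CREATE INDEX metaward_achiever_award_id ON metaward_achiever (award_id);
""")

conn.executemany("INSERT INTO metaward_award VALUES (?, ?)", [(1, 8), (2, 8), (3, 9)])
conn.executemany("INSERT INTO metaward_alias VALUES (?)", [(10,), (11,)])
conn.executemany(
    "INSERT INTO metaward_achiever (award_id, alias_id) VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10), (3, 11)],
)

# The query from the question (identifier quoting dropped for brevity).
THREE_TABLE = """
    SELECT COUNT(*) FROM metaward_alias
    INNER JOIN metaward_achiever ON metaward_alias.id = metaward_achiever.alias_id
    INNER JOIN metaward_award ON metaward_achiever.award_id = metaward_award.id
    WHERE metaward_award.owner_id = ?
"""

# Start from metaward_award and join only metaward_achiever.
TWO_TABLE = """
    SELECT COUNT(*) FROM metaward_award
    INNER JOIN metaward_achiever ON metaward_achiever.award_id = metaward_award.id
    WHERE metaward_award.owner_id = ?
"""

three_table_count = conn.execute(THREE_TABLE, (8,)).fetchone()[0]
two_table_count = conn.execute(TWO_TABLE, (8,)).fetchone()[0]
print(three_table_count, two_table_count)
```

If the two-table form is already slow on your real tables, the metaward_alias lookup is not the bottleneck and the time is going into walking the metaward_achiever rows for each matching award.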