Table 1: Tracks
Table 2: Wordlist
Table 3: N:M Track has Words (trackwords)
Goal: find all tracks that contain all of the given words.
Currently the query looks like this:
SELECT DISTINCT t.id
FROM track AS t
LEFT JOIN trackwords AS tw ON t.id = tw.trackid
LEFT JOIN wordlist AS wl ON wl.id = tw.wordid
WHERE wl.trackusecount > 0
GROUP BY t.id
HAVING SUM(IF(wl.word IN ('folsom', 'prison', 'blues'), 1, 0)) = 3;
According to EXPLAIN, it uses all the necessary indexes:
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
| id | select_type | table | type   | possible_keys         | key     | key_len | ref            | rows    | Extra       |
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
|  1 | SIMPLE      | t     | index  | PRIMARY               | PRIMARY | 4       | NULL           | 8194507 | Using index |
|  1 | SIMPLE      | tw    | ref    | wordid,trackid        | trackid | 4       | mbdb.t.id      |       3 | Using where |
|  1 | SIMPLE      | wl    | eq_ref | PRIMARY,trackusecount | PRIMARY | 4       | mbdb.tw.wordid |       1 | Using where |
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
But the query takes ages.
Any suggestions on how to speed up the query?
Your problem is very much like that of storing tags for items, as Stack Overflow or Del.icio.us does.
The article Tags: Database schemas proposes several solutions, among them @ChssPly76’s idea.
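As a rough sketch of the tag-style "intersection" query applied to the tables above: the EXPLAIN output shows MySQL walking all 8,194,507 rows of track first and applying the word filter only in HAVING, so it may help to start from the three words instead and join outward. The following is untested against your data and rests on some assumptions: that wordlist.word is indexed, that trackwords has an index leading with wordid (the EXPLAIN output suggests one named wordid exists), and that every trackwords.trackid refers to an existing track. The index names in the comments are hypothetical.

```sql
-- Assumption: an index on wordlist.word exists; if not, something like this
-- (hypothetical name) would be needed for the WHERE clause to be cheap:
--   CREATE INDEX idx_wordlist_word ON wordlist (word);
-- Optional assumption: a composite index so the join can be answered from the index:
--   CREATE INDEX idx_trackwords_word_track ON trackwords (wordid, trackid);

SELECT tw.trackid
FROM wordlist AS wl
JOIN trackwords AS tw ON tw.wordid = wl.id   -- inner join: only rows for the matched words
WHERE wl.word IN ('folsom', 'prison', 'blues')
  AND wl.trackusecount > 0                   -- kept from the original query
GROUP BY tw.trackid
HAVING COUNT(DISTINCT wl.id) = 3;            -- track must match all three distinct words
```

COUNT(DISTINCT wl.id) is used instead of SUM(IF(...)) so that a word linked twice to the same track is not counted twice. The literal 3 must equal the number of distinct words in the IN list. Running EXPLAIN on this version should show whether wl is now read first with only a handful of rows.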