I have articles that are tagged; one article can have n tags.
Currently there are around 250k tag entries, each of which points back
to the article it belongs to.
Now I want to find all tags of articles matching certain criteria.
I came up with two different approaches, both of which have drawbacks and are slow.
Maybe someone can point me in the right direction on how to speed them up, or even come
up with a better solution.
The keys (ind, rindex) are varchar(255); unfortunately, this can't be changed.
Query #1
taking 7.5s – subselect returns 60 records in 50ms
SELECT count(*) AS tagscount, tags.value FROM tags
WHERE tags.`rindex` IN
(
SELECT article.ind
FROM article
INNER JOIN struktur ON (struktur.ind = article.struktur)
WHERE article.date = '2011-12-21'
)
AND tags.`rtable` = 'article'
GROUP BY tags.value ORDER BY tagscount DESC LIMIT 20
Query #2
taking 60ms
SELECT count(*) AS tagscount, tags.value FROM tags
INNER JOIN article ON (article.ind = tags.rindex AND tags.rtable = 'article')
LEFT JOIN structure ON (article.structure = structure.ind)
WHERE article.date = '2011-12-21'
GROUP BY tags.value ORDER BY tagscount DESC LIMIT 20
The Strange Part – Important
When I change article.date = '2011-12-21' to article.date >= '2009-12-21':
Query #1
- taking 10.1s – subselect returns 18k rows in 70ms
Query #2
- taking 14.2s
If you need further information, I'll be happy to provide it.
SCHEMAS
mysql> SHOW COLUMNS FROM tags;
+---------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| ind | varchar(255) | NO | PRI | | |
| rtable | varchar(255) | NO | MUL | | |
| rindex | varchar(255) | NO | MUL | | |
| value | varchar(40) | YES | MUL | NULL | |
+---------+--------------+------+-----+---------+-------+
mysql> SHOW indexes FROM tags
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tags | 0 | tags_ind | 1 | ind | A | 275834 | NULL | NULL | | BTREE | |
| tags | 1 | tags_tag | 1 | tag | A | 27583 | NULL | NULL | YES | BTREE | |
| tags | 1 | tags_rindex | 1 | rindex | A | 55166 | NULL | NULL | | BTREE | |
| tags | 1 | tags_rindex_tabelle | 1 | tabelle | A | 4 | 30 | NULL | | BTREE | |
| tags | 1 | tags_rindex_tabelle | 2 | rindex | A | 55166 | 50 | NULL | | BTREE | |
+-------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
mysql> SHOW COLUMNS FROM structure;
+------------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+--------------+------+-----+---------+-------+
| ind | varchar(255) | NO | PRI | | |
+------------------------+--------------+------+-----+---------+-------+
mysql> SHOW COLUMNS FROM artikel;
+--------------------+--------------+------+-----+------------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+------------+-------+
| ind | varchar(255) | NO | PRI | | |
| date | date | NO | MUL | 0000-00-00 | |
+--------------------+--------------+------+-----+------------+-------+
EXPLAIN
mysql> explain #1
+----+--------------------+----------+--------+-------------------------------------------------------------------------------------+---------------------+---------+---------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+--------+-------------------------------------------------------------------------------------+---------------------+---------+---------------------+--------+----------------------------------------------+
| 1 | PRIMARY | tags | ref | tags_rindex_tabelle | tags_rindex_tabelle | 32 | const | 177175 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | artikel | eq_ref | artikel_ind,zeitraum_start_i,freigabe_i,korrektur_i,struktur_i,artikel_start_slot_i | artikel_ind | 257 | func | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | struktur | eq_ref | struktur_ind,struktur_host | struktur_ind | 257 | ec.artikel.struktur | 1 | Using where |
+----+--------------------+----------+--------+-------------------------------------------------------------------------------------+---------------------+---------+---------------------+--------+----------------------------------------------+
mysql> explain #2
+----+-------------+----------+--------+-------------------------------------------------------------------------------------+---------------------+---------+---------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+-------------------------------------------------------------------------------------+---------------------+---------+---------------------+--------+----------------------------------------------+
| 1 | SIMPLE | tags | ref | tags_rindex,tags_rindex_tabelle | tags_rindex_tabelle | 32 | const | 177175 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | artikel | eq_ref | artikel_ind,zeitraum_start_i,freigabe_i,korrektur_i,struktur_i,artikel_start_slot_i | artikel_ind | 257 | ec.tags.rindex | 1 | Using where |
| 1 | SIMPLE | struktur | eq_ref | struktur_ind,struktur_host | struktur_ind | 257 | ec.artikel.struktur | 1 | Using where |
+----+-------------+----------+--------+-------------------------------------------------------------------------------------+---------------------+---------+---------------------+--------+----------------------------------------------+
I assume artikel.ind is not restricted to ascending lexical order in the same order as artikel.date. If it is, the obvious solution would be to add a restriction on rindex that corresponds to the date range. As it is, it looks like an appropriate plan is being used.
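To illustrate what such a restriction might look like, here is a rough sketch. It is only valid under the assumption stated above (that artikel.ind sorts in the same order as artikel.date), which probably does not hold for this data. The table and column names are taken from the question, which mixes article/artikel, so adjust them to the real schema.

```sql
-- Hypothetical: only correct if artikel.ind sorts in the same order as artikel.date.
SELECT COUNT(*) AS tagscount, tags.value
FROM tags
INNER JOIN artikel ON artikel.ind = tags.rindex
WHERE tags.rtable = 'article'
  AND artikel.date >= '2009-12-21'
  -- Extra range predicate: gives the optimizer a chance to range-scan
  -- an index on tags.rindex instead of reading every 'article' tag row.
  AND tags.rindex >= (
        SELECT MIN(a.ind)
        FROM artikel AS a
        WHERE a.date >= '2009-12-21'
      )
GROUP BY tags.value
ORDER BY tagscount DESC
LIMIT 20;
```

Whether the optimizer actually picks a range scan here depends on the index definitions and the MySQL version, so check it with EXPLAIN.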
Your best bet without changing data types would be to create a materialized view indexed on
(artikel.date, tags.value, artikel.ind) and then query that.
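MySQL has no built-in materialized views, so in practice this would probably mean a summary table that you fill and refresh yourself. A minimal sketch follows. The table name artikel_tags_mv and the index name are made up for illustration, and the join to struktur from the original queries is left out for brevity.

```sql
-- Hypothetical summary table standing in for a materialized view.
CREATE TABLE artikel_tags_mv (
  `date`  DATE         NOT NULL,
  `value` VARCHAR(40)  NULL,
  ind     VARCHAR(255) NOT NULL,
  -- If this exceeds the storage engine's index key length limit,
  -- a prefix such as ind(50) is one option.
  KEY date_value_ind (`date`, `value`, ind)
);

-- Populate it once; it has to be refreshed (or maintained by the application
-- or by triggers) whenever tags or artikel change.
INSERT INTO artikel_tags_mv (`date`, `value`, ind)
SELECT artikel.`date`, tags.`value`, artikel.ind
FROM tags
INNER JOIN artikel ON artikel.ind = tags.rindex
WHERE tags.rtable = 'article';

-- The tag count then only touches the summary table and its index.
SELECT COUNT(*) AS tagscount, `value`
FROM artikel_tags_mv
WHERE `date` >= '2009-12-21'
GROUP BY `value`
ORDER BY tagscount DESC
LIMIT 20;
```

The trade-off is that the summary table can go stale between refreshes, so this fits best if slightly outdated tag counts are acceptable.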