Overview:
I have three tables (subscribers, bio, and shirtsizes), and I need to find the subscribers without a bio or a shirt size.
The tables are laid out as follows:
subscribers
| season_id | user_id |
bio
| bio_id | user_id |
shirtsizes
| bio_id | shirtsize |
I need to find all users who do not have a bio or a shirt size for any given season (if there is no bio, then there is no shirt size either, because shirt sizes are related through the bio).
I originally wrote a query like:
SELECT *
FROM subscribers s
LEFT JOIN bio b ON b.user_id = s.user_id
LEFT JOIN shirtsizes ss ON ss.bio_id = b.bio_id
WHERE s.season_id = 185181
  AND (b.bio_id IS NULL OR ss.shirtsize IS NULL);
but it now takes 10 seconds to complete.
I am wondering how I can restructure the query (or possibly the problem) so that it performs reasonably.
Here is the MySQL EXPLAIN output (ogu = subscribers, b = bio, tn = shirtsizes):
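+----+-------------+-------+-------+---------------+---------+---------+-------------+--------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref         | rows   | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+-------------+--------+-------------+
|  1 | SIMPLE      | ogu   | ref   | PRIMARY       | PRIMARY | 4       | const       |    133 | Using where |
|  1 | SIMPLE      | b     | index | NULL          | PRIMARY | 8       | NULL        | 187644 | Using index |
|  1 | SIMPLE      | tn    | ref   | nid           | nid     | 4       | waka2.b.nid |      1 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+-------------+--------+-------------+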
The above is pretty sanitized; here is the real info:
mysql> DESCRIBE subscribers;
+-------------+---------+------+-----+---------+-------+
| Field       | Type    | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+-------+
| subscribers | int(11) | NO   | PRI |         |       |
| uid         | int(11) | NO   | PRI |         |       |
+-------------+---------+------+-----+---------+-------+

mysql> DESCRIBE bio;
+--------+------------------+------+-----+---------+-------+
| Field  | Type             | Null | Key | Default | Extra |
+--------+------------------+------+-----+---------+-------+
| bio_id | int(10) unsigned | NO   | PRI | 0       |       |
| uid    | int(10) unsigned | NO   | PRI | 0       |       |
+--------+------------------+------+-----+---------+-------+

mysql> DESCRIBE shirtsize;
+-----------+------------------+------+-----+---------+-------+
| Field     | Type             | Null | Key | Default | Extra |
+-----------+------------------+------+-----+---------+-------+
| bio_id    | int(10) unsigned | NO   | PRI | 0       |       |
| shirtsize | int(10) unsigned | NO   | PRI | 0       |       |
+-----------+------------------+------+-----+---------+-------+
and the real query looks like:
SELECT ogu.nid, ogu.is_active, ogu.uid, b.nid AS bio_node, tn.nid AS size
FROM og_uid ogu
LEFT JOIN bio b ON b.uid = ogu.uid
LEFT JOIN term_node tn ON tn.nid = b.nid
WHERE ogu.nid = 185033
  AND ogu.is_admin = 0
  AND (b.nid IS NULL OR tn.tid IS NULL)
Here nid is either the season_id or the bio_id (with a type), and term_node plays the role of the shirtsize table.
The query should be OK. I would run it through a query analyzer and refine the indexes on the tables.
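As one hedged illustration of what refining the indexes might look like: the EXPLAIN output shows bio (b) being read with type "index" over roughly 187,644 rows and possible_keys NULL, which suggests that no index is usable for the join on b.uid. Assuming the primary key on bio is (nid, uid) in that order, a lookup by uid alone cannot use it, so a separate index on uid may help. The index name idx_bio_uid is made up for this sketch, and the actual effect depends on the real data and MySQL version.

```sql
-- Assumption: bio's primary key is (nid, uid), so uid is not its leftmost column.
-- idx_bio_uid is a hypothetical name; pick whatever fits your naming scheme.
ALTER TABLE bio ADD INDEX idx_bio_uid (uid);

-- Re-run EXPLAIN on the real query to see whether the plan for b changes.
EXPLAIN
SELECT ogu.nid, ogu.is_active, ogu.uid, b.nid AS bio_node, tn.nid AS size
FROM og_uid ogu
LEFT JOIN bio b ON b.uid = ogu.uid
LEFT JOIN term_node tn ON tn.nid = b.nid
WHERE ogu.nid = 185033
  AND ogu.is_admin = 0
  AND (b.nid IS NULL OR tn.tid IS NULL);
```

If the new index is picked up, the row for b should ideally show a "ref" lookup with a small row estimate instead of a full index scan. If it does not, the plan is worth inspecting again before changing anything else.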