I am trying to use the following query on a table with ~200k records in it. There are all sorts of other fields that can be filtered by, but this is a base example.
SELECT b.isbn FROM books b
WHERE
b.price IS NOT NULL AND
b.deleted = '' AND
b.publication_date <= '2009-12-04' AND
(
b.subject1_id IN ('CAT1','CAT2','CAT3','CAT4','CAT5') OR
b.subject2_id IN ('CAT1','CAT2','CAT3','CAT4','CAT5') OR
b.subject3_id IN ('CAT1','CAT2','CAT3','CAT4','CAT5')
)
Currently, I have a separate index on all of these fields and this query takes ~4.5 seconds, which is way too long. EXPLAIN lists NULL under key.
I also tried creating one large index that included all of the fields in the above query, but EXPLAIN shows that this multi-field index is not used.
How can I index these fields to speed up my queries?
EDIT: Here are my current indices (none of which seem to be used by the query):
- index(price)
- index(deleted)
- index(publication_date)
- index(subject1_id)
- index(subject2_id)
- index(subject3_id)
- index(price, deleted, publication_date, subject1_id, subject2_id, subject3_id)
EDIT2: Per ʞɔıu’s answer – after normalizing the tables and using basically his query, it does speed it up some (time is ~3.5 seconds now), but not as much as I’m looking for. I indexed the new table as PRIMARY KEY(isbn, subject_id) and this index is being used for the join.
EDIT3: I added an additional index on the second table (subject_id, isbn), which helps. The addition of the other index that ʞɔıu mentions below helps a small bit, but only gets used when I use “FORCE INDEX” on the query. It’s down to about 1.5 seconds now. Is there hope of getting it much lower?
You need to normalize your schema before indexing will be of any help in this case.
You can create another table that contains (subject, isbn), add indexes on book and subject, then join to that table, like:
Rule #1 (literally) in schema normalization is: “no repeating groups“.
Having the OR operation in the where clause across the 3 subject columns is going to prevent you from being able to take advantage of an index for that part of the query.
(updated to reflect that isbn is the primary key)