I have the following SQL query:
SELECT DISTINCT business_key
FROM Memory
WHERE concept <> 'case' OR attrib <> 'status' OR value <> 'closed'
What I try to achieve is to get all unique business keys that don’t have a record concept=case AND attrib=status AND value=closed. Running this query in MySQL with 500 000 records with all unique business_keys is very slow: about 11 seconds.
I placed indices to the business_key column, to the concept, attrib and value columns. I also tried with a combined index to all three columns (concept, attrib, value) but the result is the same.
Here is a screenshot of the EXPLAIN EXTENDED command:

The interesting thing is that running the query without the distinct specifier results in a very fast execution.
I had also tried this:
SELECT DISTINCT m.business_key
FROM Memory m
WHERE m.business_key NOT IN
(SELECT c.business_Key
FROM Memory c
WHERE c.concept = 'case' AND c.attrib = 'status' AND c.value = 'closed')
with even worse results: around 25 seconds
You could add a compound
(concept, attrib, value, business_key)index so the query (if MySQL decides to use this index) can find all the info in the index without having to read the whole table.Your query is equivalent to:
and to this (which will probably yield the same execution plan):
Since the 4 columns that are to be put in the index are all
VARCHAR(255), the index length will be pretty large. MyISAM will not allow more than 1000 bytes and InnoDB no more than 3072.One solution is to cut the length of the last part, making the index length less than 1000:
255+255+255+230 = 995:(concept, attrib, value, business_key(220))It will work but it’s really not good to have so large index lengths, performance wise.
Another option is to lower the length of all or some of those 4 columns, if that complies with the data you expect to store there. No need to declare length
255if you expect to have maximum of100in a column.Another option you may consider is putting those 4 columns in 4 separate reference tables. (Or just the columns that have repeated data. It seems that
business_keywill have duplicate data but not that many. So, it won’t be much good to make a reference table for that column.)Example: Put
conceptvalues in a new table with something like:and then change the
Memorytable with:do this (once):
and then drop the
Memory.conceptcolumn:You can also add
FOREIGN KEYreferences if you change your tables from MyISAM to InnoDB.After doing the same for all 4 columns, not only the length of new compound index in the
Memorytable will be much smaller but your table size will be much smaller, too. Additionally, any other index that uses any of those columns will have smaller length.Off course the query would need 4 JOINs to be written. And any
INSERT,UPDATEorDELETEstatement to this table will have to be changed and carefully designed.But overall, I think you will have better performance. With the design you have now, it seems that values like
'case','status'and'closed'are repeated many times.