Following up on a separate question/answer: I'm running into the issue that, with thousands of records, an index can't really be used.
I came up with the provided answer myself some time ago and have had it implemented for a while now. There are now several thousand events in a database (with separate indexes on the startdatetime and enddatetime columns), but the MySQL interpreter can't really use them because of the query itself:
SELECT * FROM table WHERE start_date <= end_of_range
AND stop_date >= start_of_range
Am I correct in thinking this can't easily be optimized further? It means having to look through 40K records just to know which events occur today (or in any other range, for that matter).
My question: how do the bigger applications solve this issue?
More information after the comments below:
Query:
EXPLAIN SELECT id
FROM event
WHERE startDatetime <= '2011-03-31 23:59:59'
AND endDatetime >= '2011-03-01 00:00:00'
id  select_type  table  type  possible_keys              key   key_len  ref   rows   Extra
1   SIMPLE       event  ALL   startDatetime,endDatetime  NULL  NULL     NULL  58331  Using where
In other words: the entire table? Just to be clear: the query isn't slow by definition, but it doesn't use any index either?
You are probably describing a non-problem.
In your test query, MySQL is considering two indexes (and that's all you can ask of it). It uses neither because its statistics tell it that a table scan will be more efficient than an index lookup.
I assume that your test query is not selective enough to trigger the use of the indexes. Your test case covers a one-month range of data: what percentage of the rows satisfies the condition, according to each of the indexes?
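One way to answer the selectivity question is to count how many rows each condition matches on its own. The following is only a rough sketch against the event table from the question; the column aliases are made up for illustration. It relies on MySQL evaluating a comparison to 1 or 0, so SUM counts the matching rows.

```sql
-- Count the rows matched by each condition separately and by both together.
-- If match_start or match_end is a large share of total_rows, an index on
-- that column alone is unlikely to beat a full table scan.
SELECT
    COUNT(*)                                          AS total_rows,
    SUM(startDatetime <= '2011-03-31 23:59:59')       AS match_start,
    SUM(endDatetime   >= '2011-03-01 00:00:00')       AS match_end,
    SUM(startDatetime <= '2011-03-31 23:59:59'
        AND endDatetime >= '2011-03-01 00:00:00')     AS match_both
FROM event;
```

If most events start before the end of the range, or most end after its start, each single-column condition matches most of the table even when match_both is small, which would explain the optimizer's choice.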
The only thing you can improve is to create a composite index, as I think that in your example MySQL's index merge will not be able to help you. So do realize that having separate indexes on
startDateTime and on endDateTime is a different situation from having a composite index on
(startDateTime, endDateTime). This index should be most useful for queries that look for events that start within a range and apply additional criteria on endDateTime.
You might also consider having another index:
(endDateTime, startDateTime) (this one should help the most for queries that look for events that end within the range and apply additional criteria on startDateTime).
You might also read up on table scans and see how forcing an index or modifying some server-side variables might affect your performance.
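As a sketch of the suggestions above, the two composite indexes could be created and tried out as follows. The index names idx_start_end and idx_end_start are hypothetical; the table and column names are taken from the EXPLAIN in the question. Whether either index actually beats the table scan depends on your data, so compare the EXPLAIN output and real timings rather than assuming an improvement.

```sql
-- Composite index for queries driven by the start of the event
-- (index name is an assumption, pick your own).
ALTER TABLE event ADD INDEX idx_start_end (startDatetime, endDatetime);

-- Composite index for queries driven by the end of the event.
ALTER TABLE event ADD INDEX idx_end_start (endDatetime, startDatetime);

-- Let the optimizer choose, and check the "key" column of the result.
EXPLAIN SELECT id
FROM event
WHERE startDatetime <= '2011-03-31 23:59:59'
  AND endDatetime >= '2011-03-01 00:00:00';

-- Force one of the indexes to compare its plan with the optimizer's choice.
EXPLAIN SELECT id
FROM event FORCE INDEX (idx_start_end)
WHERE startDatetime <= '2011-03-31 23:59:59'
  AND endDatetime >= '2011-03-01 00:00:00';
```

If the table uses InnoDB and id is its primary key, either composite index contains every column this query needs, so MySQL may be able to answer it from the index alone (shown as "Using index" in the Extra column).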