I have the following query:
SELECT *
from stop_times
WHERE (departure_time BETWEEN '02:41' AND '05:41'
OR departure_time BETWEEN '26:41' AND '29:41')
AND stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)
that returns 134 rows in ~800ms. If I split it:
SELECT *
from stop_times
WHERE (departure_time BETWEEN '02:41' AND '05:41'
OR departure_time BETWEEN '26:41' AND '29:41')
returns ~110k rows in ~10ms and
SELECT *
from stop_times
WHERE stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)
returns ~5k rows in ~100ms.
I tried both a multi-column index (departure_time and stop_id) and 2 separate indexes, but in either case the first query can’t seem to take less than ~800ms. My stop_times table has about 3.5M rows. Is there anything I’m missing that would significantly speed up that first query?
UPDATE 1: SHOW CREATE TABLE:
CREATE TABLE `stop_times` (
`trip_id` varchar(20) DEFAULT NULL,
`departure_time` time DEFAULT NULL,
`stop_id` varchar(20) DEFAULT NULL,
KEY `index_stop_times_on_trip_id` (`trip_id`),
KEY `index_stop_times_on_departure_time_and_stop_id` (`departure_time`,`stop_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
stop_id and trip_id being varchars instead of integers is beyond my control unfortunately…
UPDATE 2: EXPLAIN for departure_time, stop_id multi-column index:
select_type: SIMPLE
type: range
rows: 239084
EXPLAIN for stop_id, departure_time multi-column index:
select_type: SIMPLE
type: range
rows: 141
UPDATE 3: EXPLAIN for IN(51511,51509,51508,51510,6,53851,51522,51533)
select_type: SIMPLE
type: ALL
rows: 3556973 (lol)
EXPLAIN for IN("51511","51509","51508","51510","6","53851","51522","51533")
select_type: SIMPLE
type: range
rows: 141
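The difference between the two plans comes from the column type: stop_id is a varchar, so comparing it with integer literals forces MySQL to convert each row's value to a number, and the index on stop_id cannot be used. With string literals the comparison matches the column type and the range scan becomes possible. A sketch of the first query with quoted values (same values as above, nothing else changed):

```sql
-- stop_id is varchar(20), so the literals are quoted to match the column type
SELECT *
FROM stop_times
WHERE (departure_time BETWEEN '02:41' AND '05:41'
    OR departure_time BETWEEN '26:41' AND '29:41')
  AND stop_id IN ('51511','51509','51508','51510','6','53851','51522','51533');
```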
Did you create an index on stop_id, departure_time? Because departure_time, stop_id will do absolutely nothing.
This is a really hard one: it has every possible bad thing for dealing with indexes 🙁
You have a range, an OR and a non-contiguous IN. It doesn’t get worse than that.
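For reference, an index with stop_id as the leading column could be added along these lines. This is only a sketch: the index name is made up for illustration, and building an index on a 3.5M-row table may take a while.

```sql
-- Hypothetical index name; the column order (stop_id first) is the point here
ALTER TABLE stop_times
  ADD INDEX index_stop_times_on_stop_id_and_departure_time (stop_id, departure_time);
```

With stop_id first, each IN value narrows the search to one part of the index, and the departure_time range is then applied within that part.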
Try stop_id, departure_time and if it doesn’t help then there is nothing much you can do short of switching to PostgreSQL.
You can also try rewriting the query as:
or:
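One possible rewrite, given here as an assumption and not necessarily the form originally intended, is to split the OR into two queries joined with UNION ALL, so that each branch is a plain stop_id list plus a single departure_time range. The stop_id values are quoted as strings to match the varchar column.

```sql
-- Branch 1: the early-morning window
SELECT *
FROM stop_times
WHERE stop_id IN ('51511','51509','51508','51510','6','53851','51522','51533')
  AND departure_time BETWEEN '02:41' AND '05:41'
UNION ALL
-- Branch 2: the same window expressed as times past 24:00
SELECT *
FROM stop_times
WHERE stop_id IN ('51511','51509','51508','51510','6','53851','51522','51533')
  AND departure_time BETWEEN '26:41' AND '29:41';
```

The two time windows do not overlap, so UNION ALL returns the same rows as the OR version without needing a de-duplication step.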