i have a table: CREATE TABLE `p` ( `id` bigint(20) unsigned NOT NULL, `rtime`

Question

Asked: June 10, 2026

i have a table:

CREATE TABLE `p` (  
`id` bigint(20) unsigned NOT NULL,  
`rtime` datetime NOT NULL,  
`d` int(10) NOT NULL,  
`n` int(10) NOT NULL,  
PRIMARY KEY (`rtime`,`id`,`d`) USING BTREE  
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

and i have a query:

select id, d, sum(n) from p where  rtime between '2012-08-25' and date(now()) group by id, d;

i’m running explain on this query on a tiny table (2 records) and it tells me it’s going to use my PK:

id  | select_type  | table | type   | possible_keys      | key     | key_len | ref  | rows | Extra
1   | SIMPLE       | p     | range  | PRIMARY            | PRIMARY | 8       | NULL | 1    | Using where; Using temporary; Using filesort

but when i use the same query on the same table – only this time it’s huge (350 million records) – it prefers to go through all the records and ignore my keys

id  | select_type  | table  | type | possible_keys  | key  | key_len | ref  | rows      | Extra
1   | SIMPLE       | p      | ALL  | PRIMARY        | NULL | NULL    | NULL | 355465280 | Using where; Using temporary; Using filesort

obviously, this is extremely slow..
can anyone help?

EDIT: this simple query is also taking a significant amount of time:

select count(*) from propagation_delay where  rtime > '2012-08-28';

1 Answer

Editorial Team · Answer 1 · 2026-06-10T08:32:32+00:00

Your query:

...WHERE rtime between '2012-08-25' and date(now()) group by id, d;

filters on rtime and groups by id and d. At a minimum you need an index that leads with rtime (your primary key already does). You could also try an index on (rtime, id, d, n), in that order, so the query can be answered from the index alone, but that index would contain more or less the same data as the table itself.
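For reference, the wider index on (rtime, id, d, n) could be tried out roughly like this. This is only a sketch: the index name idx_rtime_id_d_n is made up for illustration, and building any index on a 350-million-row MyISAM table is likely to take a long time and a fair amount of disk space.

```sql
-- Hypothetical index name; columns in the order suggested above.
ALTER TABLE p ADD INDEX idx_rtime_id_d_n (rtime, id, d, n);

-- Re-check the plan. If MySQL can answer the query from the index alone,
-- the Extra column of EXPLAIN shows "Using index".
EXPLAIN
SELECT id, d, SUM(n)
FROM p
WHERE rtime BETWEEN '2012-08-25' AND DATE(NOW())
GROUP BY id, d;
```

If the plan still shows type ALL with key NULL, the optimizer has decided the range is too wide for the index to pay off.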

Probably, the optimizer does some calculations and comes to the conclusion that it’s not really worthwhile to employ the index.

I’d keep an index on rtime alone. The real clincher is how many records match the WHERE: if there are only a few, it is cheap to read the index and hop around the table to fetch them. If there are many, it may be better to scan the whole table sequentially and save on the to-and-fro reads.

the query is getting a big chunk out of those 350 mil – i’d say a few millions

Okay, then it is likely that the cumulative cost of quickly finding a few million entries in the index, and then shuttling to and fro across the main table to fetch those few million rows, is higher than the cost of opening the main table and trawling through all 350M records, grouping and summing along the way.
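If you want to check whether the optimizer's guess is right for your data, one way is to force the primary key with MySQL's FORCE INDEX hint and compare against the unhinted query. Treat this as a diagnostic sketch, not a recommendation to leave the hint in place:

```sql
-- Plan with the primary key forced instead of the full table scan.
EXPLAIN
SELECT id, d, SUM(n)
FROM p FORCE INDEX (PRIMARY)
WHERE rtime BETWEEN '2012-08-25' AND DATE(NOW())
GROUP BY id, d;
```

Then run both variants without EXPLAIN and time them. If the forced version is no faster, the full scan really is the cheaper plan for a range this wide.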

In such a scenario, if you always (or mostly) run aggregate queries on rtime, AND the table is an accumulating (historical) table, AND each pair (id, d) sees several scores of entries per day, you might consider creating a secondary table aggregated by date. I.e., at (say) midnight, you run a query such as:

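SET @yesterday = CURDATE() - INTERVAL 1 DAY;  -- run shortly after midnight
INSERT INTO aggregate_table
    SELECT @yesterday AS rtime, id, d, SUM(n) AS n
    FROM main_table
    WHERE rtime >= @yesterday AND rtime < @yesterday + INTERVAL 1 DAY  -- range, so the index on rtime is usable
    GROUP BY id, d;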

aggregate_table then holds exactly one row per pair (id, d) per day, containing the sum of n for that day; the table is proportionately smaller, and queries on it are faster. This assumes that you have a comparatively small number of (id, d) pairs and that each of them generates lots of rows in the main table each day.
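The layout of aggregate_table isn't spelled out here, so the following is one possible shape for it, assuming one row per day per (id, d) pair. The column types are guesses based on the table in the question, and the engine and charset are simply copied from it.

```sql
-- Assumed layout: one row per day per (id, d) pair.
CREATE TABLE aggregate_table (
    rtime DATE NOT NULL,              -- the day being summarised
    id    BIGINT(20) UNSIGNED NOT NULL,
    d     INT(10) NOT NULL,
    n     BIGINT NOT NULL,            -- daily sum of n; BIGINT leaves headroom
    PRIMARY KEY (rtime, id, d)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

-- The report from the question, now reading the much smaller table.
SELECT id, d, SUM(n)
FROM aggregate_table
WHERE rtime BETWEEN '2012-08-25' AND CURDATE()
GROUP BY id, d;
```

Rows for the current day only appear once the nightly job has run, so a report that needs today's partial data would have to add those rows from the main table separately.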

With one row logged per minute per pair, 1,440 rows per pair per day collapse into one, so aggregation should speed things up by about three orders of magnitude (conversely, if you have the twice-daily readings of a huge number of different sensors, the benefit will be negligible).
