The basic form of the query is:
EXPLAIN SELECT SUM(impressions) as impressions, SUM(clicks) as clicks, SUM(cost) as cost, SUM(conversions) as conversions, keyword_id FROM `keyword_track` WHERE user_id=1 AND campaign_id=543 AND `recorded`>1325376071 GROUP BY keyword_id
It seems that I can index, say, user_id, campaign_id and keyword_id and get the GROUP BY without a filesort. However, a range index on recorded would cut down the number of rows far more aggressively. This example has a big range, but other queries have a much smaller time range.
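To make the two options concrete, they would look roughly like this. The index names are placeholders, not keys that exist on the table today.

```sql
-- Option A: equality columns followed by the GROUP BY column.
-- Rows come back already ordered by keyword_id, so grouping needs no filesort,
-- but the `recorded` condition has to be checked row by row.
ALTER TABLE `keyword_track`
  ADD INDEX `idx_user_campaign_keyword` (`user_id`, `campaign_id`, `keyword_id`);

-- Option B: equality columns followed by the range column.
-- The range on `recorded` narrows the rows read from the index,
-- but the rows are not in keyword_id order, so grouping needs extra work.
ALTER TABLE `keyword_track`
  ADD INDEX `idx_user_campaign_recorded` (`user_id`, `campaign_id`, `recorded`);
```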
Table looks like:
CREATE TABLE IF NOT EXISTS `keyword_track` (
`track_id` int(11) NOT NULL auto_increment,
`user_id` int(11) NOT NULL,
`campaign_id` int(11) NOT NULL,
`adgroup_id` int(11) NOT NULL,
`keyword_id` int(11) NOT NULL,
`recorded` int(11) NOT NULL,
`impressions` int(11) NOT NULL,
`clicks` int(11) NOT NULL,
`cost` decimal(10,2) NOT NULL,
`conversions` int(11) NOT NULL,
`max_cpc` decimal(3,2) NOT NULL,
`quality_score` tinyint(4) NOT NULL,
`avg_position` decimal(2,1) NOT NULL,
PRIMARY KEY (`track_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;
I have left out the keys I currently have. Basically, my question is: what would be the best way to get an index on the range while still indexing at least campaign_id, and ideally not needing a filesort (although that might be an acceptable tradeoff to get a range index on the recorded time)?
The best index for you in this case is a composite one:
user_id + campaign_id + recorded
Though this will not help to avoid the filesort, as long as you have a > comparison on recorded and a GROUP BY on a field that isn't included in the index at all.
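As a rough sketch, that index could be created and checked as follows. The index name is made up for illustration, and the exact EXPLAIN output will depend on your MySQL version and data distribution.

```sql
-- Composite index: the two equality columns first, the range column last.
ALTER TABLE `keyword_track`
  ADD INDEX `idx_user_campaign_recorded` (`user_id`, `campaign_id`, `recorded`);

-- Re-run EXPLAIN on the original query. The `key` column should name the new
-- index with a range access type; the Extra column will most likely still
-- report a temporary table and/or filesort for the GROUP BY.
EXPLAIN SELECT SUM(impressions) AS impressions, SUM(clicks) AS clicks,
       SUM(cost) AS cost, SUM(conversions) AS conversions, keyword_id
FROM `keyword_track`
WHERE user_id = 1 AND campaign_id = 543 AND `recorded` > 1325376071
GROUP BY keyword_id;
```

The range column goes last because MySQL cannot use index columns that come after the first range condition to narrow the lookup any further.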