I have a MySQL Table that holds Google Analytics data:
CREATE TABLE IF NOT EXISTS `analytics_data` (
`ga_profile_id` int(11) NOT NULL,
`page` varchar(200) NOT NULL,
`source` varchar(150) NOT NULL,
`medium` varchar(50) NOT NULL,
`keyword` varchar(200) NOT NULL,
`bounces` int(11) NOT NULL,
`entrances` int(11) NOT NULL,
`exits` int(11) NOT NULL,
`new_visits` int(11) NOT NULL,
`page_views` int(11) NOT NULL,
`unique_page_views` int(11) NOT NULL,
`time_on_page` int(11) NOT NULL,
`visits` int(11) NOT NULL,
`date` date NOT NULL,
KEY `ga_profile_id` (`ga_profile_id`,`source`,`medium`,`date`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
I have a query to compute the sum of visitors based on a google analytics profile ID (ga_profile_id) over a given time period:
SELECT
SUM( `visits` ), ( UNIX_TIMESTAMP( `date` ) - 21600 ) * 1000 AS date
FROM `analytics_data`
WHERE
`date` >= '2011-05-09' AND `date` <= '2011-06-08' AND `ga_profile_id` = [...]
GROUP BY `date`
We have 4.5 million records or so.
Index Data:
Type: BTREE
Fields/Cardinality:
ga_profile_id / 100
source / 10196
medium / 10196
date / 149893
EXPLAIN SELECT
– id: 1
– select_type: SIMPLE
– table: analytics_data
– type: ref
– possible_keys: ga_profile_id
– key: ga_profile_id
– ref: const
– rows: 219555
– extra: Using where; Using temporary; Using filesort
Average time for execution: 1 second.
We are on a virtual private server and most queries get executed in .0003 – 0.03 seconds. LONG queries (that I was going to optimize at some point) are generally .3 seconds.
I have tried adjusting the keys, ignoring some, changing some values and nothing seems to be affecting it in a positive way. Considering this is 1 of many queries on a page.
I am looking at changing MyISAM to memory — any ideas are welcomed.
You need to create composite index
ga_profile_id + datein this particular order. And you’ll get the best you could get with such query.Further possible optimization is to pre-calculate sum of visits per date and use that for fast calculations.