I have a large table in MySQL (running within MAMP). It has 28 million rows and is 3.1GB in size. Here is its structure:
CREATE TABLE `termusage` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`termid` bigint(20) DEFAULT NULL,
`date` datetime DEFAULT NULL,
`dest` varchar(255) DEFAULT NULL,
`cost_type` tinyint(4) DEFAULT NULL,
`cost` decimal(10,3) DEFAULT NULL,
`gprsup` bigint(20) DEFAULT NULL,
`gprsdown` bigint(20) DEFAULT NULL,
`duration` time DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `termid_idx` (`termid`),
KEY `date_idx` (`date`),
KEY `cost_type_idx` (`cost_type`),
CONSTRAINT `termusage_cost_type_cost_type_cost_code` FOREIGN KEY (`cost_type`) REFERENCES `cost_type` (`cost_code`),
CONSTRAINT `termusage_termid_terminal_id` FOREIGN KEY (`termid`) REFERENCES `terminal` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=28680315 DEFAULT CHARSET=latin1
Here is the output from SHOW TABLE STATUS :
Name,Engine,Version,Row_format,Rows,Avg_row_length,Data_length,Max_data_length,Index_length,Data_free,Auto_increment,Create_time,Update_time,Check_time,Collation,Checksum,Create_options,Comment
'termusage', 'InnoDB', '10', 'Compact', '29656469', '87', '2605711360', '0', '2156920832', '545259520', '28680315', '2011-08-16 15:16:08', NULL, NULL, 'latin1_swedish_ci', NULL, '', ''
I’m trying to run the following select statement:
select u.id from termusage u
where u.date between '2010-11-01' and '2010-12-01'
It takes 35 minutes to return the result (approx. 14 million rows). This is using MySQL Workbench.
I have the following MySQL config setup:
Variable_name Value
bulk_insert_buffer_size 8388608
innodb_buffer_pool_instances 1
innodb_buffer_pool_size 3221225472
innodb_change_buffering all
innodb_log_buffer_size 8388608
join_buffer_size 131072
key_buffer_size 8388608
myisam_sort_buffer_size 8388608
net_buffer_length 16384
preload_buffer_size 32768
read_buffer_size 131072
read_rnd_buffer_size 262144
sort_buffer_size 2097152
sql_buffer_result OFF
Eventually I’m trying to run a larger query that joins a couple of tables and groups some data, all based on one variable, the customer id:
select c.id,
       u.termid,
       u.cost_type,
       count(*) as count,
       sum(u.cost) as cost,
       (sum(u.gprsup) + sum(u.gprsdown)) as gprsuse,
       sum(time_to_sec(u.duration)) as duration
from customer c
inner join terminal t
on (c.id = t.customer)
inner join termusage u
on (t.id = u.termid)
where c.id = 1 and u.date between '2011-03-01' and '2011-04-01'
group by c.id, u.termid, u.cost_type
This returns a maximum of 8 rows (as there are only 8 separate cost_types). The query runs OK when there are not many (fewer than 1 million) rows in the termusage table to calculate, but takes forever when the number of rows in the termusage table is large. How can I reduce the select time?
Data is added to the termusage table once a month from CSV files using the LOAD DATA method, so it doesn’t need to be quite so tuned for inserts.
EDIT: EXPLAIN output for the main query:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,c,const,PRIMARY,PRIMARY,8,const,1,"Using index; Using temporary; Using filesort"
1,SIMPLE,u,ALL,"termid_idx,date_idx",NULL,NULL,NULL,29656469,"Using where"
1,SIMPLE,t,eq_ref,"PRIMARY,customer_idx",PRIMARY,8,wlnew.u.termid,1,"Using where"
Looks like you’re asking two questions – correct?
The most likely reason the first query takes so long is that it’s IO-bound. It takes a long time to read 14 million records from disk and send them down the wire to MySQL Workbench.
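One rough way to check how much of the 35 minutes is transfer rather than lookup is to run the same date range as a count, which returns a single row. This is only a sketch using the table and column names from the question; it assumes nothing else about the schema.

```sql
-- Same date range as the first query, but only one row comes back,
-- so the elapsed time reflects reading the index or table rather than
-- shipping roughly 14 million ids to the client.
select count(*)
from termusage u
where u.date between '2010-11-01' and '2010-12-01';
```

If this comes back much faster than the original select, most of the time is probably going into transferring and displaying rows rather than finding them.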
Have you tried putting the second query through “explain”? Yes, you only get back 8 rows, but the SUM operation may be summing millions of records.
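For reference, running the query through explain just means prefixing it with the keyword. The statement below is the query from the question, unchanged apart from layout:

```sql
-- EXPLAIN shows the plan MySQL would use without executing the query.
explain
select c.id, u.termid, u.cost_type,
       count(*) as count,
       sum(u.cost) as cost,
       (sum(u.gprsup) + sum(u.gprsdown)) as gprsuse,
       sum(time_to_sec(u.duration)) as duration
from customer c
inner join terminal t on (c.id = t.customer)
inner join termusage u on (t.id = u.termid)
where c.id = 1
  and u.date between '2011-03-01' and '2011-04-01'
group by c.id, u.termid, u.cost_type;
```

In the output, a `type` of ALL with a `rows` estimate close to the full table size for termusage would mean MySQL is scanning the whole table rather than using one of its indexes.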
I’m assuming the “customer” and “terminal” tables are appropriately indexed? As you’re joining termusage to the primary key of terminal (via the indexed termid column), that should be really quick…
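A quick way to confirm the indexing is to list the indexes on each table involved. The commented-out statement at the end is not something the question or this answer has tested; it is a hypothetical composite index (the name `termid_date_idx` is made up here) that might be worth trying if explain shows a full scan of termusage.

```sql
-- List the existing indexes on the three tables in the join.
show index from customer;
show index from terminal;
show index from termusage;

-- Hypothetical, untested suggestion: a composite index covering both the
-- join column and the date filter. Building it on a table of this size
-- may take a long time, so try it on a copy first.
-- alter table termusage add index termid_date_idx (`termid`, `date`);
```

Whether the optimizer picks such an index depends on how selective the customer's terminals and the date range are, so re-run explain afterwards to see if the plan actually changes.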