I have PostgreSQL 9.2 and MySQL 5.5 (InnoDB) installed on my laptop.
Both database engines use the default installation settings and were populated from the same CSV file.
I have a ‘sales_reports’ table with ca. 700K rows.
Scenario 1:
- following query:
select name, year, region, branch from sales_reports group by name, year, region, branch;
- PostgreSQL 9.2: Total query runtime: 42.14 sec, 18064 rows retrieved
- PostgreSQL explain:
Group  (cost=165091.16..174275.61 rows=73476 width=58) (actual time=35196.959..41896.739 rows=18064 loops=1)
  ->  Sort  (cost=165091.16..166928.05 rows=734756 width=58) (actual time=35196.956..41704.549 rows=734756 loops=1)
        Sort Key: name, year, region, branch
        Sort Method: external merge  Disk: 49920kB
        ->  Seq Scan on sales_reports  (cost=0.00..38249.56 rows=734756 width=58) (actual time=0.048..282.331 rows=734756 loops=1)
Total runtime: 41906.628 ms
- MySQL 5.5: Total query runtime: 4.4 sec, 18064 rows retrieved
- MySQL explain:
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table         | type | possible_keys | key  | key_len | ref  | rows   | Extra                           |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
|  1 | SIMPLE      | sales_reports | ALL  | NULL          | NULL | NULL    | NULL | 729433 | Using temporary; Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
- PostgreSQL is about 10x slower
Scenario 2:
- following query:
select name, year, region, branch, sum(sale) as sale from sales_reports group by name, year, region, branch;
- PostgreSQL 9.2: Total query runtime: 42.51 sec, 18064 rows retrieved
- PostgreSQL explain:
GroupAggregate  (cost=165091.16..176847.26 rows=73476 width=64) (actual time=35160.911..42254.060 rows=18064 loops=1)
  ->  Sort  (cost=165091.16..166928.05 rows=734756 width=64) (actual time=35160.489..41857.986 rows=734756 loops=1)
        Sort Key: name, year, region, branch
        Sort Method: external merge  Disk: 54760kB
        ->  Seq Scan on sales_reports  (cost=0.00..38249.56 rows=734756 width=64) (actual time=0.047..296.347 rows=734756 loops=1)
Total runtime: 42264.790 ms
- MySQL 5.5: Total query runtime: 8.15 sec, 18064 rows retrieved
- MySQL explain:
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table         | type | possible_keys | key  | key_len | ref  | rows   | Extra                           |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
|  1 | SIMPLE      | sales_reports | ALL  | NULL          | NULL | NULL    | NULL | 729433 | Using temporary; Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
- PostgreSQL is about 5x slower
Scenario 3:
- following query:
select name, year, region, sum(sale) as sale from sales_reports group by name, year, region;
- PostgreSQL 9.2: Total query runtime: 1 sec, 18064 rows retrieved
- PostgreSQL explain:
HashAggregate  (cost=45597.12..45655.62 rows=5850 width=37) (actual time=758.396..759.756 rows=4644 loops=1)
  ->  Seq Scan on sales_reports  (cost=0.00..38249.56 rows=734756 width=37) (actual time=0.061..116.541 rows=734756 loops=1)
Total runtime: 760.133 ms
- MySQL 5.5 : Total query runtime: 5.8 sec, 18064 rows retrieved
- MySQL explain:
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table         | type | possible_keys | key  | key_len | ref  | rows   | Extra                           |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
|  1 | SIMPLE      | sales_reports | ALL  | NULL          | NULL | NULL    | NULL | 729433 | Using temporary; Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
- PostgreSQL is about 5x faster
Any ideas why the first two scenarios are so slow on PostgreSQL?
BTW, on PostgreSQL I created indexes on the fields I’m using in the query; I didn’t create any indexes on MySQL.
Thanks,
Marek
The default PostgreSQL config is rather conservative. For starters, try increasing shared_buffers to 1GB. (Remember to restart the server for the change to take effect.)

Also, try increasing work_mem until the GroupAggregate switches to HashAggregate in the explain. You can change this without a restart.

A word of warning: before messing with the settings in production, please read the friendly manual; there are some gotchas involved.
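
The work_mem experiment can be tried in a single psql session, roughly as sketched below. The 128MB value is only an illustrative assumption, not a tuned recommendation; the "Sort Method: external merge Disk: ..." lines in the plans above suggest the sort spilled to disk, so the idea is to raise work_mem step by step and re-check the plan each time.

```sql
-- Show the current per-operation memory budget for sorts and hashes.
SHOW work_mem;

-- Raise it for this session only; no restart needed.
-- '128MB' is an assumed starting point, adjust up or down as needed.
SET work_mem = '128MB';

-- Re-run the slow query from Scenario 2 and inspect the plan.
-- Things to look for: HashAggregate instead of GroupAggregate,
-- or an in-memory sort instead of "external merge  Disk: ...".
EXPLAIN ANALYZE
SELECT name, year, region, branch, sum(sale) AS sale
FROM sales_reports
GROUP BY name, year, region, branch;

-- Go back to the server default for this session.
RESET work_mem;

-- shared_buffers cannot be changed with SET; it is set in
-- postgresql.conf and requires a server restart. This only displays it.
SHOW shared_buffers;
```

Keep in mind that work_mem applies per sort or hash operation in each backend, so a value that is fine for one session on a laptop may be too large as a global setting on a busy server.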