We are running Postgres 9.1.3 and we have recently started to run into major performance problems on one of our servers.
Our queries ran fine for a while, but as of August 1st they have slowed down dramatically. Most of the problematic queries appear to be SELECT queries (queries with count(*) are especially bad), but in general the database is just running really slowly.
We ran this query on the server to list the settings we have changed from the default config file (note: the server ran fine with these changes before, so they likely don’t matter much):
name | current_setting
---------------------------+---------------------------------------------------------------------------------------------------------------
version | PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 64-bit
autovacuum | off
bgwriter_delay | 20ms
checkpoint_segments | 6
checkpoint_warning | 0
client_encoding | UTF8
default_statistics_target | 1000
effective_cache_size | 4778MB
effective_io_concurrency | 2
fsync | off
full_page_writes | off
lc_collate | en_US.UTF-8
lc_ctype | en_US.UTF-8
listen_addresses | *
maintenance_work_mem | 1GB
max_connections | 100
max_stack_depth | 2MB
port | 5432
random_page_cost | 2
server_encoding | UTF8
shared_buffers | 1792MB
synchronous_commit | off
temp_buffers | 16MB
TimeZone | US/Eastern
wal_buffers | 16MB
wal_level | minimal
wal_writer_delay | 10ms
work_mem | 16MB
(28 rows)
Time: 210.231 ms
Normally, when problems like this arise, the first thing people recommend is vacuuming and we have tried that. We vacuum analyzed most of the database, but it didn’t help.
We used Explain on some of our queries and noticed that Postgres was resorting to sequential scans even though the tables had indexes.
We turned sequential scan off to force the query planner into using indexes, but that did not help either.
We then tried out this query to see whether Postgres was wading through a lot of unused disk space to find what it was looking for. Unfortunately, while some of our tables did have a bit of bloat, it did not seem significant enough to slow down overall system performance.
We think the slowdown might be I/O related, but we can’t figure out the specifics. Is Postgres just being silly and if so, what part of it? Is there something wrong with the VM, or perhaps something wrong with the physical hardware itself?
Do you guys have any other suggestions for things that we can try or check out?
EDIT:
I am so sorry for not updating this sooner. I got caught up in other things.
On this particular machine, our performance greatly improved by making one small modification to the Virtual Machine’s settings.
There is a setting that deals with IO caching. It was originally set to ON. We figured that constantly caching things was slowing things down and we were right. We turned it OFF, and things improved drastically.
Interestingly enough most of our other servers already had this setting turned off.
There are other issues, and I am sure we will take a lot of your suggestions, so, thanks a lot for helping.
It’s difficult to be sure, but I think you are right to be suspicious of I/O issues. What can happen is that as tables get larger or the number of connections increases, cache hits start to fall. That increases I/O demands and slows everything down. Meanwhile, more queries arrive, making the problem worse. The situation is complicated for you because virtual disks don’t necessarily behave the same as physical ones.
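One rough way to see whether cache hits are falling is to sample the block counters in pg_stat_database over time and compute a hit ratio. The sketch below is only an illustration: the function name and the SQL string constant are my own, and the counters only reflect hits in Postgres's shared buffers, so a block counted as "read" may still have come from the OS or hypervisor cache rather than from disk.

```python
# Query to fetch the raw counters (run it in psql or from any client).
# blks_hit and blks_read are columns of the pg_stat_database view.
HIT_RATIO_SQL = """
SELECT datname, blks_hit, blks_read
FROM pg_stat_database
WHERE datname = current_database();
"""


def cache_hit_ratio(blks_hit, blks_read):
    """Return the shared-buffer hit ratio in [0, 1], or None if no blocks were requested.

    Hypothetical helper: pass in the two counters returned by HIT_RATIO_SQL.
    """
    total = blks_hit + blks_read
    if total == 0:
        return None  # no activity recorded yet, ratio is undefined
    return blks_hit / total


if __name__ == "__main__":
    # Made-up counter values, purely to show the calculation.
    print(cache_hit_ratio(900, 100))
```

The counters are cumulative, so comparing two samples taken some minutes apart (and computing the ratio on the differences) says more about current behaviour than a single reading.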
Firstly you will need to measure actual activity on the VM (through vmstat or iostat perhaps). Secondly, do the same on the real hardware. Finally, run some standard disk bandwidth tools on both (in particular random read/write mixes). Now you’ll be able to say how much of your available I/O is being used.
As for query plans, without the schema details and EXPLAIN ANALYZE output, no one can say.
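If iostat is not available inside the VM, a similar utilisation figure can be approximated on Linux from two samples of /proc/diskstats. This is a minimal sketch, not a replacement for the real tools: the function names are hypothetical, it assumes the usual field layout (device name in the third column, milliseconds spent doing I/O in the thirteenth), and the sample lines are invented.

```python
def parse_diskstats_line(line):
    """Pick out a few counters from one line of /proc/diskstats (assumed Linux layout)."""
    fields = line.split()
    return {
        "device": fields[2],
        "reads": int(fields[3]),    # reads completed
        "writes": int(fields[7]),   # writes completed
        "io_ms": int(fields[12]),   # milliseconds spent doing I/O
    }


def io_summary(before, after, interval_s):
    """Compare two samples taken interval_s seconds apart.

    Returns (utilisation as a fraction of the interval, I/O operations per second).
    """
    busy_ms = after["io_ms"] - before["io_ms"]
    ops = (after["reads"] - before["reads"]) + (after["writes"] - before["writes"])
    return busy_ms / (interval_s * 1000.0), ops / float(interval_s)


if __name__ == "__main__":
    # Invented sample lines for a device called sda, taken one second apart.
    first = parse_diskstats_line("8 0 sda 1000 0 8000 500 2000 0 16000 900 0 1200 1400")
    second = parse_diskstats_line("8 0 sda 1100 0 8800 550 2050 0 16400 950 0 1700 1900")
    print(io_summary(first, second, 1))
```

A utilisation close to 1.0 on the VM while the physical host shows much less would point at the virtual disk layer rather than at Postgres itself.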
You will find the postgresql.org mailing list useful even if just for the archives. Also, the book linked below is excellent.
http://www.packtpub.com/postgresql-90-high-performance/book