This is what happens when PostgreSQL tries to start after a power failure:
2012-01-27 18:00:44 MSK LOG: database system was interrupted while in recovery at 2012-01-27 18:00:16 MSK
2012-01-27 18:00:44 MSK HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
2012-01-27 18:00:44 MSK LOG: database system was not properly shut down; automatic recovery in progress
2012-01-27 18:00:44 MSK LOG: consistent recovery state reached at 17/762C39B8
2012-01-27 18:00:44 MSK LOG: redo starts at 17/761F6A40
2012-01-27 18:00:44 MSK FATAL: invalid page header in block 311757 of relation base/26976/27977
2012-01-27 18:00:44 MSK CONTEXT: xlog redo insert: rel 1663/26976/27977; tid 311757/44
2012-01-27 18:00:44 MSK LOG: startup process (PID 392) exited with exit code 1
2012-01-27 18:00:44 MSK LOG: aborting startup due to startup process failure
I know I’m not out of luck and there is a command I can use to repair the database in this situation. It doesn’t matter if the last few hours of transactions disappear, as long as the database becomes consistent.
Please advise me on what to do in this situation.
It depends on how much data you’re willing to give up.
You can set zero_damaged_pages to on in your postgresql.conf configuration file and then give it a try – but that will cause data loss. It might work or might not work.
If you want to try that, always start by shutting down the postgres database and taking a full filesystem copy of it (e.g. tar). Because it might still be the least broken version you have. Then once you’ve set it, run a pg_dump right away, wipe the cluster, and restore the dump. And absolutely do not run the new cluster with zero_damaged_pages on by default, remember to turn it back off!
And then set up proper Point-In-Time Recovery backups for the new cluster.
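To make the order of operations easier to review, here is a minimal Python sketch that only builds and prints the command list and does not execute anything. The function names (recovery_plan, format_plan), the paths, and the database name mydb are placeholders I made up for illustration. The pg_ctl, tar and pg_dump options shown are the usual ones, but check them against your installed versions before running anything by hand.

```python
import shlex

# Line to add to postgresql.conf for the salvage run only (remove it afterwards).
SALVAGE_SETTING = "zero_damaged_pages = on"


def recovery_plan(data_dir, backup_tar, db_name, dump_file):
    """Return the salvage steps in order as (description, argv) pairs.

    Nothing is executed here. argv is None for steps that are done by hand.
    All arguments are placeholders supplied by the caller.
    """
    return [
        # The server may already be down after the failed startup;
        # the point is that nothing is writing to the data directory.
        ("Make sure the server is stopped",
         ["pg_ctl", "stop", "-D", data_dir, "-m", "fast"]),
        # Keep the least broken copy you have before touching anything.
        ("Take a full filesystem copy of the data directory",
         ["tar", "-czf", backup_tar, data_dir]),
        ("Add '%s' to postgresql.conf" % SALVAGE_SETTING, None),
        ("Start the server and wait for it to come up",
         ["pg_ctl", "start", "-D", data_dir, "-w"]),
        ("Dump the database right away",
         ["pg_dump", "-f", dump_file, db_name]),
    ]


def format_plan(steps):
    """Render the plan as numbered lines for review before running anything."""
    lines = []
    for number, (description, argv) in enumerate(steps, start=1):
        command = shlex.join(argv) if argv is not None else "(manual step)"
        lines.append("%d. %s: %s" % (number, description, command))
    return "\n".join(lines)


if __name__ == "__main__":
    plan = recovery_plan(
        data_dir="/var/lib/postgresql/data",   # hypothetical path
        backup_tar="/backup/pgdata.tar.gz",    # hypothetical path
        db_name="mydb",                        # hypothetical database name
        dump_file="/backup/mydb.sql",          # hypothetical path
    )
    print(format_plan(plan))
```

The sketch stops at the dump on purpose. Wiping the cluster, restoring the dump into a fresh one, and removing the zero_damaged_pages line again are still up to you, and so is deciding whether the dump looks sane enough to restore.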