The relevant part of my database schema is a table called User, which has a boolean field Admin. There is an index on this Admin field.
The day before, I had restored my full production database onto my development machine and then made only very minor changes to it, so the two databases should have been very similar.
When I ran the following command on my development machine, I got the expected result:
EXPLAIN SELECT * FROM user WHERE admin IS TRUE;
Index Scan using index_user_on_admin on user (cost=0.00..9.14 rows=165 width=3658)
Index Cond: (admin = true)
Filter: (admin IS TRUE)
However, when I ran the exact same command on my production machine, I got this:
Seq Scan on user (cost=0.00..620794.93 rows=4966489 width=3871)
Filter: (admin IS TRUE)
So instead of using the index that was a perfect match for the query, it was doing a sequential scan of almost 5 million rows!
I then tried to run EXPLAIN ANALYZE SELECT * FROM user WHERE admin IS TRUE; with the hope that ANALYZE would make Postgres realize a sequential scan of 5 million rows wasn’t as good as using the index, but that didn’t change anything.
I also tried to run REINDEX INDEX index_user_on_admin in case the index was corrupted, without any benefit.
Finally, I called VACUUM ANALYZE user and that resolved the problem in short order.
My main understanding of vacuum is that it is used to reclaim wasted space. What could have been going on that would cause my index to misbehave so badly, and why did vacuum fix it?
It was most likely the ANALYZE that helped, by updating the data statistics used by the planner to determine what would be the best way to run a query. VACUUM ANALYZE just runs the two commands in order, VACUUM first, ANALYZE second, but ANALYZE itself would probably be enough to help.
The ANALYZE option to EXPLAIN has nothing at all to do with the ANALYZE command. It just causes Postgres to run the query and report the actual run times, so that they can be compared with the planner predictions (EXPLAIN without the ANALYZE only displays the query plan and what the planner thinks it will cost, but does not actually run the query). So EXPLAIN ANALYZE did not help because it did not update the statistics. ANALYZE and EXPLAIN ANALYZE are two completely different actions that just happen to use the same word.
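If you want to check this yourself, something along these lines should work. It is only a sketch: the table, column, and index names are taken from the question, and I am assuming the table is literally named user, which has to be double-quoted because user is a reserved word in Postgres. Adjust the names to your real schema. pg_stat_user_tables and pg_stats are standard system views.

```sql
-- When were statistics last gathered for the table, either by a manual
-- ANALYZE or by autovacuum? NULL means never since the stats were reset.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'user';   -- assumed table name, taken from the question

-- What the planner currently believes about the distribution of admin values.
SELECT attname, null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'user' AND attname = 'admin';

-- Refresh the planner statistics only; no VACUUM involved.
ANALYZE "user";

-- Look at the plan again. Plain EXPLAIN does not execute the query.
EXPLAIN SELECT * FROM "user" WHERE admin IS TRUE;
```

If the row estimate in the new plan drops from millions to something close to the real number of admins, stale statistics were the cause, and the ANALYZE half of VACUUM ANALYZE is what fixed it.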