Suppose you want to find the last record entered into the database (highest ID) matching a string: Model.where(:name => 'Joe'). There are 100,000+ records. There are many matches (say thousands).
What is the most efficient way to do this? Does PostgreSQL need to find all the records, or can it just find the last one? Is this a particularly slow query?
Working in Rails 3.0.7, Ruby 1.9.2 and PostgreSQL 8.3.
The important part here is to have a matching index. You can try this small test setup:
Create schema
xfor testing:Insert 10000 random rows:
Insert another 10000 rows with repeating names:
Delete random 10% to make it more real life:
Query can look like this:
–> Total runtime: 5.535 ms
–> Total runtime: 1.228 ms
–> Total runtime: 0.053 ms
–> Total runtime: 0.048 ms
–> Total runtime: 1.144 ms
–> Total runtime: 0.047 ms
Conclusion
With a fitting index, the query performs more than 100x faster.
Top performer is a multicolumn index with the filter column first and the sort column last.
Matching sort order in the index helps a little in this case.
Clustering helps with the simple index, because still many columns have to be read from the table, and these can be found in adjacent blocks after clustering. It doesn’t help with the multicolumn index in this case, because only one record has to be fetched from the table.
Read more about multicolumn indexes in the manual.
All of these effects grow with the size of the table. 10000 rows of two tiny columns is just a very small test case.