In PostgreSQL, when a combination of multiple columns is specified as the PRIMARY KEY, how are the records ordered?
This is with the assumption that PostgreSQL orders the records in the order of the primary key. Does it?
Also, is the primary key automatically indexed in case of PostgreSQL?
This question makes the misguided assumption that the primary key imposes a table order at all. It doesn’t. PostgreSQL tables have no defined order, with or without a primary key; they’re a “heap” of rows arranged in page blocks. Ordering is imposed using the ORDER BY clause of queries when desired.

You might be thinking that PostgreSQL tables are index-organized tables kept on disk in primary key order, but that isn’t how Pg works. InnoDB stores tables organized by the primary key, and it’s optional in some other vendors’ databases using a feature often called “clustered indexes” or “index-organized tables”. This feature isn’t currently supported by PostgreSQL (as of 9.3 at least).
That said, the PRIMARY KEY is implemented using a UNIQUE index, and there is an ordering to that index. It is sorted in ascending order from the left column of the index (and therefore the primary key) onward, as if it were ORDER BY col1 ASC, col2 ASC, col3 ASC. The same is true of any other b-tree (as distinct from GiST or GIN) index in PostgreSQL, as they’re implemented using b+trees.

So in the table:
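A minimal sketch of such a table, for illustration only. The table name demo and the columns a and b are assumed names, chosen to match the (a,b) index discussed further down:

```sql
-- Hypothetical demo table with a two-column primary key.
-- The names demo, a and b are illustrative assumptions.
CREATE TABLE demo (
    a integer NOT NULL,
    b integer NOT NULL,
    PRIMARY KEY (a, b)
);
```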
the system will automatically create the equivalent of:
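As a rough sketch, assuming a hypothetical table named demo with PRIMARY KEY (a, b), the implicit index would look something like the statement below. You would not run this yourself, since the constraint already creates the index (by default named after the table with a _pkey suffix), and the real constraint also forces the key columns to be NOT NULL:

```sql
-- Approximately what PRIMARY KEY (a, b) sets up behind the scenes
-- on the hypothetical demo table. Shown for illustration only.
CREATE UNIQUE INDEX demo_pkey ON demo (a, b);
```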
This is reported to you when you create a table, eg:
You can see this index when examining the table:
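One way to check this, again assuming a hypothetical table named demo, is to query the pg_indexes system view; in psql, \d demo shows the same information in its "Indexes" section:

```sql
-- List the indexes on the hypothetical demo table, including the
-- one created implicitly for the primary key.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'demo';
```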
You can CLUSTER on this index to re-order the table according to the primary key, but it’s a one-time operation. The system won’t maintain that ordering, though a non-default FILLFACTOR that leaves free space in the pages helps updated rows stay on the same page, which preserves the ordering somewhat longer.

One consequence of the inherent ordering of the index (but not the heap) is that it is faster to search for:
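For instance, assuming a hypothetical table demo with PRIMARY KEY (a, b), orderings like these follow the index from its leftmost column onward (these queries are illustrative assumptions, not a benchmark):

```sql
-- Both orderings match the forward order of the (a, b) index.
SELECT * FROM demo ORDER BY a, b;
SELECT * FROM demo ORDER BY a;
```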
than:
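For instance, on the same hypothetical demo table, an ordering that runs against the direction of the index (an assumed example):

```sql
-- Descending order on the leading column: the (a, b) index can still
-- be used, but it has to be read backwards.
SELECT * FROM demo ORDER BY a DESC;
```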
Queries that sort on b first, such as ORDER BY b or ORDER BY b, a, get no ordering help from the primary key index at all; they’ll do a seqscan and a sort unless you have an index on b.

This is because PostgreSQL can use an index on (a,b) almost as fast as an index on (a) alone. It cannot use an index on (a,b) as if it were an index on (b) alone: without a condition on a it would have to read the whole index, which is rarely better than a seqscan.

As for the DESC entry, for that one Pg must do a reverse index scan, which is somewhat slower than an ordinary forward index scan. If you’re seeing lots of reverse index scans in EXPLAIN ANALYZE and you can afford the performance cost of the extra index, you can create an index on the field in DESC order.

This is true for WHERE clauses, not just ORDER BY. You can use an index on (a,b) to search for WHERE a = 4 or WHERE a = 4 AND b = 3, but not to search efficiently for WHERE b = 3 alone.
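The WHERE cases can be checked with EXPLAIN. This is a sketch that assumes a hypothetical table demo with PRIMARY KEY (a, b); the index names demo_b_idx and demo_a_desc_idx are made up for the example. The plans you actually get depend on table size and statistics, and on a very small table the planner may choose a seqscan for all of them:

```sql
-- Conditions on the leading column can use the (a, b) primary key index.
EXPLAIN SELECT * FROM demo WHERE a = 4;
EXPLAIN SELECT * FROM demo WHERE a = 4 AND b = 3;

-- A condition on b alone cannot use it efficiently.
EXPLAIN SELECT * FROM demo WHERE b = 3;

-- Optional extra indexes, if the write overhead is acceptable:
-- one for lookups and sorts on b alone ...
CREATE INDEX demo_b_idx ON demo (b);
-- ... and one stored in descending order on a.
CREATE INDEX demo_a_desc_idx ON demo (a DESC);
```

Each extra index has to be maintained on every insert and update, so it is worth confirming with EXPLAIN ANALYZE on realistic data that the planner actually uses it.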