I use Postgresql 9.1. I have a table and I would like to get two things done in one query:
First: select one random sample row:
select * from table order by random() limit 1;
Second: count the number of rows the sample was selected from:
select count(1) from table;
Since a proper random() function has to know about all rows, I wonder if there is a way to get both results in one query.
I am guessing for something link:
select count(1), first.* from table order by random();
but i don’t really know how to do it.
So I want a random row and the total number of rows.
Thanks for any ideas.
Steve
One way to do this:
If the table is huge and performance is an issue, there are tricks to speed up the random selection. Having a serial column without gaps would help a lot with that …
If that is for you, have a look at this related answer: Random Row in MySQL
Works in PostgreSQL, too. Only substitute
random()forrand().BTW, the
random()function works absolutely independent of the number of rows.Here is a variant that produces the same with only one sequential scan:
A quick test with a table of 5k rows was slower, though.
Another one with a table of 400k rows shows a similar result. slightly slower. Window functions can be performance hogs. And count(*) is known to be rather slow for large tables in PostgreSQL, because every tuple has to be visited.
Index-only scans (“covering indexes”) has recently been implemented in the devel version of 9.2. So we are in for a speedup with the next version here. More about that in a related post on dba.stackexchange.com:
In the meantime, 230 ms for the 400k table is not too bad.