I need to create a Postgres 9.1 PL/pgSQL stored procedure that, among other parameters, takes a sequence of values that directly reference values in one of my database columns. As far as I can tell, the canonical way to do this in Postgres is an array.
This is a rather basic task, of course. My problem is scalability: My code basically works, but performs badly once the sequences passed in get large (as in a few hundreds or thousands of values):
Even rather simple SELECT statements within my stored procedure using the array in the form
SELECT <some columns>
FROM <some tables>
WHERE <some other select criteria>
AND <column with values selected by array parameter>
IN (SELECT * FROM unnest(<array parameter>))
take several seconds to execute even though the database is not very large yet and there are only tens of values in the array.
My first suspicion was that unnest(...) is the problem, but selecting only from the table with the column referenced in the array parameter is really fast:
SELECT <some columns>
FROM <table with column ref'd in array parameter>
WHERE <column with values selected by array parameter>
IN (SELECT * FROM unnest(<array parameter>))
only takes a few milliseconds.
My questions:
- Is there an alternative to using an array as parameter ?
- How can I make my queries perform better ?
I would expect faster performance if you rewrite your query
to:
It sounds like the query planner chooses a suboptimal plan, misjudging the cost of your other
WHEREcriteria in comparison to the IN clause. By transforming it to an explicitJOINclause you should get a better query plan.Generally,
JOINs tend to be faster than largeINclauses in PostgreSQL.Yes.
You could create temporary table, fill it and run a query joining against it.
Or, faster yet, use a CTE for the same purpose:
As long as you want to use a function, an array parameter, unnested inside would be my choice, too. You could also use the CTE in my last example inside a function, just with unnest(arr) instead of a VALUES clause.