I have a function that executes a query with ORDER BY clause. When I call this function, postgres gets stuck, but when I execute the query with the values I pass to the function, it responds immediately.
The function looks like:
CREATE OR REPLACE FUNCTION get_all_synset_tags_by_name(
lim int,
offs int
) RETURNS SETOF ctags AS $$
SELECT concepts.id, expressions.name,
array_to_string(
array(
SELECT descr FROM ctags WHERE id IN (SELECT unnest(
concept_get_expr_synonyms(concepts.id)
))),'; '
), (
SELECT sum(freq) FROM ctags WHERE id IN (SELECT unnest(
concept_get_expr_synonyms(concepts.id)
)))::integer
FROM concepts,expressions
WHERE
concepts.is_dropped=FALSE AND expressions.is_dropped=FALSE AND expressions.id=concepts.expr_id AND
concepts.id=(concept_get_expr_synonyms(concepts.id))[1]
ORDER BY name
LIMIT $1 OFFSET $2;
$$ language sql STABLE;
CREATE OR REPLACE FUNCTION get_top_tags_all_name(
user_id int,
lim int,
offs int,
at bool,
dt bool,
mt bool
)
RETURNS
TABLE(
id int,
name text,
descr text,
freq int,
foll_status bool,
ad_perm bool,
rm_perm bool,
ed_perm bool,
n_folls bigint,
n_quests bigint,
n_opins bigint
) AS $$
SELECT ctags.id,ctags.name,ctags.descr,ctags.freq,
(SELECT users_tags.is_ignored
FROM users_tags
WHERE users_tags.tag=ctags.id AND users_tags.follower=$1),
$4,$5,$6,
(SELECT count(tag) FROM users_tags
WHERE users_tags.tag=ctags.id AND users_tags.is_ignored=FALSE),
(SELECT count(question_tags.question_id) FROM question_tags
LEFT JOIN questions ON question_tags.question_id=questions.id
WHERE question_tags.tag_id=ctags.id AND questions.qtype='quest' ),
(SELECT count(question_tags.question_id) FROM question_tags
LEFT JOIN questions ON question_tags.question_id=questions.id
WHERE question_tags.tag_id=ctags.id AND questions.qtype='opin' )
FROM get_all_synset_tags_by_name($2,$3) AS ctags;
$$ language sql STRICT;
When I execute the query from get_top_tags_all_name with EXPLAIN, it says:
qa=# EXPLAIN SELECT ctags.id,ctags.name,ctags.descr,ctags.freq,
qa-# (SELECT users_tags.is_ignored
qa(# FROM users_tags
qa(# WHERE users_tags.tag=ctags.id AND users_tags.follower=1),
qa-# FALSE,FALSE,FALSE,
qa-# (SELECT count(tag) FROM users_tags
qa(# WHERE users_tags.tag=ctags.id AND users_tags.is_ignored=FALSE),
qa-# (SELECT count(question_tags.question_id) FROM question_tags
qa(# LEFT JOIN questions ON question_tags.question_id=questions.id
qa(# WHERE question_tags.tag_id=ctags.id AND questions.qtype='quest' ),
qa-# (SELECT count(question_tags.question_id) FROM question_tags
qa(# LEFT JOIN questions ON question_tags.question_id=questions.id
qa(# WHERE question_tags.tag_id=ctags.id AND questions.qtype='opin' )
qa-# FROM get_all_synset_tags_by_name(10,0) AS ctags;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan ctags (cost=0.00..21323.00 rows=10 width=72)
-> Limit (cost=0.00..20434.38 rows=10 width=36)
-> Nested Loop (cost=0.00..19645616.99 rows=9614 width=36)
-> Index Scan using expressions_name_key on expressions (cost=0.00..996449.61 rows=1504967 width=36)
Filter: (NOT is_dropped)
-> Index Scan using concepts_expr_id_idx on concepts (cost=0.00..12.08 rows=1 width=8)
Index Cond: (public.concepts.expr_id = public.expressions.id)
Filter: ((NOT public.concepts.is_dropped) AND (public.concepts.id = (concept_get_expr_synonyms(public.concepts.id))[1]))
SubPlan 5
-> Nested Loop (cost=0.28..23.65 rows=1 width=43)
-> Nested Loop (cost=0.28..17.63 rows=1 width=47)
-> Nested Loop (cost=0.28..8.96 rows=1 width=8)
-> HashAggregate (cost=0.28..0.29 rows=1 width=4)
-> Result (cost=0.00..0.26 rows=1 width=0)
-> Index Scan using tags_conc_id_key on tags (cost=0.00..8.66 rows=1 width=4)
Index Cond: (public.tags.conc_id = (unnest(concept_get_expr_synonyms($1))))
-> Index Scan using concepts_id_key1 on concepts (cost=0.00..8.66 rows=1 width=51)
Index Cond: (public.concepts.id = public.tags.conc_id)
-> Index Scan using expressions_pkey on expressions (cost=0.00..6.01 rows=1 width=4)
Index Cond: (public.expressions.id = public.concepts.expr_id)
SubPlan 6
-> Aggregate (cost=23.66..23.67 rows=1 width=4)
-> Nested Loop (cost=0.28..23.65 rows=1 width=4)
-> Nested Loop (cost=0.28..17.63 rows=1 width=8)
-> Nested Loop (cost=0.28..8.96 rows=1 width=12)
-> HashAggregate (cost=0.28..0.29 rows=1 width=4)
-> Result (cost=0.00..0.26 rows=1 width=0)
-> Index Scan using tags_conc_id_key on tags (cost=0.00..8.66 rows=1 width=8)
Index Cond: (public.tags.conc_id = (unnest(concept_get_expr_synonyms($1))))
-> Index Scan using concepts_id_key1 on concepts (cost=0.00..8.66 rows=1 width=8)
Index Cond: (public.concepts.id = public.tags.conc_id)
-> Index Scan using expressions_pkey on expressions (cost=0.00..6.01 rows=1 width=4)
Index Cond: (public.expressions.id = public.concepts.expr_id)
SubPlan 1
-> Index Scan using users_tags_tag_key on users_tags (cost=0.00..8.27 rows=1 width=1)
Index Cond: ((tag = $0) AND (follower = 1))
SubPlan 2
-> Aggregate (cost=14.90..14.91 rows=1 width=4)
-> Bitmap Heap Scan on users_tags (cost=4.33..14.88 rows=5 width=4)
Recheck Cond: (tag = $0)
Filter: (NOT is_ignored)
-> Bitmap Index Scan on users_tags_tag_key (cost=0.00..4.33 rows=10 width=0)
Index Cond: (tag = $0)
SubPlan 3
-> Aggregate (cost=32.83..32.84 rows=1 width=4)
-> Nested Loop (cost=0.00..32.82 rows=2 width=4)
-> Seq Scan on questions (cost=0.00..16.25 rows=2 width=4)
Filter: (qtype = 'quest'::quest_type)
-> Index Scan using question_tags_question_id_key on question_tags (cost=0.00..8.27 rows=1 width=4)
Index Cond: ((public.question_tags.question_id = public.questions.id) AND (public.question_tags.tag_id = $0))
SubPlan 4
-> Aggregate (cost=32.83..32.84 rows=1 width=4)
-> Nested Loop (cost=0.00..32.82 rows=2 width=4)
-> Seq Scan on questions (cost=0.00..16.25 rows=2 width=4)
Filter: (qtype = 'opin'::quest_type)
-> Index Scan using question_tags_question_id_key on question_tags (cost=0.00..8.27 rows=1 width=4)
Index Cond: ((public.question_tags.question_id = public.questions.id) AND (public.question_tags.tag_id = $0))
(57 rows)
So when I call this select it responds immediately.
But when I call the function, it gets stuck:
SELECT * FROM get_top_tags_all_name(1,10,0,FALSE,FALSE,FALSE);
^CCancel request sent
ERROR: canceling statement due to user request
CONTEXT: PL/pgSQL function "concept_get_expr_synonyms" line 6 at assignment
SQL function "get_top_tags_all_name" statement 1
(I had to cancel this request because it thinked more than 5 minutes)
So I suppose that the query planner didn’t rely on the index that time. Any help would be appreciated.
May I answer my own question?
The solution is to
This directive, being issued before calling the functions (or at the beginning of the connection), disables explicit sorts in cases when other methods are available.
By default, postgresql tries to optimize big index scans on small tables by explicitly sorting the data. If the table is small, and the majority of data is needed, then explicit sort is faster than index scan. The thing is that in my case the tables are huge, and small parts of data are needed, but postgresql doesn’t know it when it prepares my functions.