Currently I have two different queries that return exactly the same results however, changing


Asked: May 20, 2026 (2026-05-20T09:56:41+00:00)


Currently I have two different queries that return exactly the same results; however, changing the parameter by which the results are filtered makes them behave very differently.

Results when searching for `cartography`

query #1: 22 rows / ~860ms;

SELECT eid FROM t_entidades 
WHERE  eid IN ( 
            SELECT     eid 
            FROM       t_entidades 
            WHERE      entidade_t LIKE '%cartography%'
)
OR     eid IN (
            SELECT    entidade as eid
            FROM      t_entidade_actividade ea
            LEFT JOIN t_actividades a ON a.aid = ea.actividade
            WHERE     a.actividade LIKE '%cartography%'
)

query #2: 22 rows / ~430ms;

SELECT      eid FROM t_entidades WHERE entidade_t LIKE '%cartography%'
UNION
SELECT      entidade as eid
FROM        t_entidade_actividade ea
LEFT JOIN   t_actividades a ON a.aid = ea.actividade
WHERE       a.actividade LIKE '%cartography%'

Results when searching for `cart`

query #1: 715 rows / ~870ms;

query #2: 715 rows / ~450ms

Results when searching for `car`

query #1: never finished while I waited; it seems to take forever, and anything over 1s would be too much

-- EXPLAIN OUTPUT:
"QUERY PLAN"
"Seq Scan on t_entidades  (cost=44997.40..219177315.47 rows=500127 width=4)"
"  Filter: ((SubPlan 1) OR (hashed SubPlan 2))"
"  SubPlan 1"
"    ->  Materialize  (cost=37712.46..38269.55 rows=40009 width=4)"
"          ->  Seq Scan on t_entidades  (cost=0.00..37515.45 rows=40009 width=4)"
"                Filter: ((entidade_t)::text ~~ '%car%'::text)"
"  SubPlan 2"
"    ->  Hash Join  (cost=36.48..7284.20 rows=298 width=4)"
"          Hash Cond: (ea.actividade = a.aid)"
"          ->  Seq Scan on t_entidade_actividade ea  (cost=0.00..5826.63 rows=378163 width=8)"
"          ->  Hash  (cost=36.46..36.46 rows=1 width=4)"
"                ->  Seq Scan on t_actividades a  (cost=0.00..36.46 rows=1 width=4)"
"                      Filter: ((actividade)::text ~~ '%car%'::text)"

query #2: 23661 rows / ~860ms

-- EXPLAIN OUTPUT:
"QUERY PLAN"
"HashAggregate  (cost=45303.48..45706.55 rows=40307 width=4)"
"  ->  Append  (cost=0.00..45202.72 rows=40307 width=4)"
"        ->  Seq Scan on t_entidades  (cost=0.00..37515.45 rows=40009 width=4)"
"              Filter: ((entidade_t)::text ~~ '%car%'::text)"
"        ->  Hash Join  (cost=36.48..7284.20 rows=298 width=4)"
"              Hash Cond: (ea.actividade = a.aid)"
"              ->  Seq Scan on t_entidade_actividade ea  (cost=0.00..5826.63 rows=378163 width=8)"
"              ->  Hash  (cost=36.46..36.46 rows=1 width=4)"
"                    ->  Seq Scan on t_actividades a  (cost=0.00..36.46 rows=1 width=4)"
"                          Filter: ((actividade)::text ~~ '%car%'::text)"

So, searching for `car` using query #1 seems to take forever, which is odd considering that `SELECT eid FROM t_entidades` takes only around 4s to return all 350k+ rows.

The only difference between the EXPLAINs for query #1 across the three searches is that for `car` the following row appears: `-> Materialize (cost=37712.46..38269.55 rows=40009 width=4)`

If someone would care to explain why query #1 takes so long to execute in the last example, and exactly what is happening at each step of the EXPLAIN, it would be much appreciated, because I never seem to get it.


1 Answer

Editorial Team · Answer 1 · 2026-05-20T09:56:42+00:00

The plan for query #1 reads to me as:

1. scan t_entidades, and for each row:
1.1. execute subplan 1 by scanning a materialized subset (temp file?) of t_entidades
1.2. execute subplan 2 by probing a hash table built from scanning t_entidade_actividade

An `EXPLAIN ANALYZE` would be able to tell you how often the two subplan steps (1.1 and 1.2) were actually run for the query. Note that the filter line reads `(SubPlan 1) OR (hashed SubPlan 2)`: only subplan 2 is marked as hashed, so subplan 1 is the likely culprit. If the scan in step 1.1 is being done for each row from step 1, then your query time will grow as O(n^2), where n is the number of rows in t_entidades, and the temp space used for each iteration of 1.1 will increase as the number of matches in that table increases.
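As a rough sketch of that check: the statement below is query #1 from the question wrapped in `EXPLAIN ANALYZE`. It uses the `cart` search term as an assumption, because `EXPLAIN ANALYZE` actually executes the query and the `car` version may never finish. In the output, each plan node reports `loops=N`, which is how many times that node was run.

```sql
-- Sketch only: query #1 from the question, run with the 'cart' term so it
-- completes. EXPLAIN ANALYZE executes the statement for real.
EXPLAIN ANALYZE
SELECT eid FROM t_entidades
WHERE  eid IN (
            SELECT     eid
            FROM       t_entidades
            WHERE      entidade_t LIKE '%cart%'
)
OR     eid IN (
            SELECT    entidade AS eid
            FROM      t_entidade_actividade ea
            LEFT JOIN t_actividades a ON a.aid = ea.actividade
            WHERE     a.actividade LIKE '%cart%'
);
```

If the node under SubPlan 1 shows a `loops` value close to the number of rows in t_entidades, that would support the per-row rescan reading of the plan.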

Your query 2 is much better written, IMHO. Each of the two sets of IDs is produced in a quite different way, so put them in separate queries and use a UNION to merge them together at the end. It also cuts out the useless outer scan of t_entidades in query 1, which just passes through IDs from the WHERE clause. (Not that it's relevant to PostgreSQL, but it also makes it clear that the two scans could be run in parallel and then merged; never mind.)

t_entidade_actividade.actividade might need an index?
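A minimal sketch of that index, assuming none exists yet on the column; the index name is made up for illustration. Whether the planner actually uses it for these queries is not guaranteed, and it would not help the `LIKE '%...%'` filters themselves, since a plain B-tree index cannot serve a pattern with a leading wildcard.

```sql
-- Hypothetical index name; the table and column come from the question.
CREATE INDEX t_entidade_actividade_actividade_idx
    ON t_entidade_actividade (actividade);

-- Refresh planner statistics for the table after creating the index.
ANALYZE t_entidade_actividade;
```

Re-running EXPLAIN on both queries afterwards would show whether the join on `ea.actividade = a.aid` switches away from the sequential scan of t_entidade_actividade.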
