I have a query that looks a bit like this (note: the actual query

Question

Asked: May 27, 2026 (2026-05-27T00:34:03+00:00)

I have a query that looks a bit like this (note: the actual query is generated by Hibernate and is a bit more complicated):

select * from outage_revisions orev
join outages o
    on orev.outage=o.id
    where o.observed_end is null
    and orev.observation_date =
        (select max(observation_date)
            from outage_revisions orev2
            where orev2.observation_date <= '2011-11-21 00:00:00'
            and orev2.outage = orev.outage);

This query runs very slowly (about 15 minutes). However, if I take out the part of the where clause with the subquery, it comes back almost instantly (about 83 milliseconds) with only about 14 rows.

Furthermore, the subquery itself is very fast (about 31 milliseconds):

select max(observation_date) from outage_revisions orev2
where orev2.observation_date <= '2011-11-21 00:00:00'
and orev2.outage = 1

My question is this: if there are only 14 rows returned from the full query excluding the subquery filter, why does adding the subquery slow down the query so much? Should not the subquery add at most approximately 31*14 milliseconds?

Here is the plan for the full query:

Nested Loop  (cost=0.00..71078813.16 rows=1 width=115)
   ->  Seq Scan on outagerevisions orev  (cost=0.00..71077624.67 rows=284 width=79)
         Filter: (observationdate = (SubPlan 2))
         SubPlan 2
           ->  Result  (cost=1250.56..1250.57 rows=1 width=0)
                 InitPlan 1 (returns $1)
                   ->  Limit  (cost=0.00..1250.56 rows=1 width=8)
                         ->  Index Scan Backward using idx_observationdate on outagerevisions orev2  (cost=0.00..2501.12 rows=2 width=8)
                               Index Cond: (observationdate <= '2011-11-21 00:00:00'::timestamp without time zone)
                               Filter: ((observationdate IS NOT NULL) AND (outage = $0))
   ->  Index Scan using outages_pkey on outages o  (cost=0.00..4.17 rows=1 width=36)
         Index Cond: (o.id = orev.outage)
         Filter: (o.observedend IS NULL)


1 Answer

Editorial Team · Answer 1 · 2026-05-27T00:34:04+00:00

My guess is that PostgreSQL is making a poor choice about how to execute the query. Although it seems obvious that it should narrow the result down to the 14 or so rows before running the correlated subquery, the plan suggests it is not doing that: the subquery appears as a filter on the sequential scan of the revisions table, so it is evaluated once per revision row (something like 60,000 times) rather than 14 times. While doing that, it also has to track which rows continue on to the next step.
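
One way to check this guess, as a sketch: run the original query under EXPLAIN ANALYZE and look at the loops= figure reported on the nodes under the SubPlan, which shows how many times the subquery was actually executed. Keep in mind that EXPLAIN ANALYZE really runs the statement, so expect it to take as long as the slow query does.

```sql
-- Runs the query for real and reports actual row counts and timings
-- for each plan node. The "loops=" value on the nodes under the SubPlan
-- is the number of times the correlated subquery was executed.
EXPLAIN ANALYZE
select * from outage_revisions orev
join outages o
    on orev.outage=o.id
    where o.observed_end is null
    and orev.observation_date =
        (select max(observation_date)
            from outage_revisions orev2
            where orev2.observation_date <= '2011-11-21 00:00:00'
            and orev2.outage = orev.outage);
```

If loops is close to the number of rows in outage_revisions rather than 14, the filter is being applied before the join, which would account for the run time.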

Here are a couple of other ways that you could try to write it:

SELECT
    <column list>
FROM
    Outage_Revisions OREV
JOIN Outages O ON
    OREV.outage = O.id
LEFT OUTER JOIN Outage_Revisions OREV2 ON
    OREV2.outage = OREV.outage AND
    OREV2.observation_date <= '2011-11-21 00:00:00' AND
    OREV2.observation_date > OREV.observation_date
WHERE
    O.observed_end IS NULL AND
    OREV2.outage IS NULL

or, assuming Hibernate can express a join against a subquery (PostgreSQL itself supports this):

SELECT
    <column list>
FROM
    Outage_Revisions OREV
JOIN Outages O ON
    OREV.outage = O.id
JOIN (SELECT OREV2.outage, MAX(OREV2.observation_date) AS max_observation_date
      FROM Outage_Revisions OREV2
      WHERE OREV2.observation_date <= '2011-11-21 00:00:00'
      GROUP BY OREV2.outage) SQ ON
    SQ.outage = OREV.outage AND
    SQ.max_observation_date = OREV.observation_date
WHERE
    O.observed_end IS NULL

You can play around with the order of the joins in that last query.
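
As one example of such a reordering, here is a sketch that starts from the open outages and computes the latest revision date only for those. The aliases r, o2 and sq are made up for illustration; the table and column names are taken from the question. Whether the planner does better with this form is something to verify with EXPLAIN on your own data.

```sql
-- Sketch only: same tables and columns as the question, joins reordered
-- so that the aggregate is restricted to open outages.
SELECT orev.*, o.*
FROM outages o
JOIN (
    -- Latest revision date on or before the cutoff, per open outage.
    SELECT r.outage, MAX(r.observation_date) AS max_observation_date
    FROM outage_revisions r
    JOIN outages o2 ON o2.id = r.outage
    WHERE o2.observed_end IS NULL
      AND r.observation_date <= '2011-11-21 00:00:00'
    GROUP BY r.outage
) sq ON sq.outage = o.id
-- Pick up the full revision row that carries that latest date.
JOIN outage_revisions orev
    ON orev.outage = sq.outage
   AND orev.observation_date = sq.max_observation_date;
```

The outer query needs no observed_end filter here, because the subquery already returns only outages whose observed_end is null.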
