Using Postgres v. 8.4.2. I have a standard three-table setup for a many-to-many relationship.

Question

0

Asked: May 26, 20262026-05-26T22:48:19+00:00 2026-05-26T22:48:19+00:00

Using Postgres v. 8.4.2. I have a standard three-table setup for a many-to-many relationship.

0

Using Postgres v. 8.4.2.

I have a standard three-table setup for a many-to-many relationship. A query, written against two of these tables, is supposed to return a single record for which there is no existing mapping:

SELECT b.id
   FROM scm_branch b
      LEFT OUTER JOIN "scm_branchgroup_branches" bgb
         ON b.id = bgb.branch_id
   WHERE
         bgb.branch_id is NULL
   LIMIT 1

(scm_branchgroup_branches is the ternary table keeping the mapping records)

this query has the following top level cost:

Limit (cost=0.00..0.22 rows=1 width=4)

However if I change the field used in the WHERE condition to be the primary key of the bgb table, the cost goes drastically up:

SELECT b.id
   FROM scm_branch b
      LEFT OUTER JOIN "scm_branchgroup_branches" bgb
         ON b.id = bgb.branch_id
   WHERE
         bgb.id IS NULL 
   LIMIT 1

(note: this query uses “bgb.id IS NULL” vs “bgb.branch_id is NULL” used in the previous example)

Top level cost:

Limit (cost=236366.74..4644900.75 rows=1 width=4)

Why such a drastic difference? I mean I see the difference in execution plans, but I don’t understand the underlying reasons for the difference.

I’ll be happy to provide any additional info needed to answer this question.

Thanks
D.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-26T22:48:20+00:00

In the version with branch_id IS NULL, PostgreSQL recognizes that you’re doing an anti-join, and it optimizes accordingly, using a “hash-join”-type algorithm. (Google PostgreSQL hash left anti-join for lots of information on this optimization.) In the version with id IS NULL, however, it doesn’t recognize this fact, and its optimizations are not nearly so helpful. What’s more, the join basically forces it to use the index on branch_id, so it can’t take advantage of the primary-key index on id, and must consult the actual table data to determine if a given record satisfies the WHERE-clause.

(There may well be other factors at play as well — I’m not an expert on PostgreSQL — but I believe those are the main factors.)

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

Using Postgres v. 8.4.2. I have a standard three-table setup for a many-to-many relationship.

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply