Sorry for the rather specific question.
I have a table (see bottom), and when I try to delete a lot of records from it, my PostgreSQL 8.2.5 spends 98% of the time in the trigger for the parent-child foreign-key constraint.
I’m trying to figure out what index I should add to make it go fast.
I have to say that all rows in this table have either 0 or null as the parent_block_id: the column is vestigial.
I’ve tried adding different indexes: a plain one on (parent_block_id), and partial ones with WHERE parent_block_id = 0, WHERE parent_block_id IS NULL, and WHERE parent_block_id != 0. None of those resulted in a serious performance benefit.
varshavka=> explain analyze delete from infoblocks where template_id = 112;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Seq Scan on infoblocks (cost=0.00..1234.29 rows=9 width=6) (actual time=13.271..40.888 rows=40000 loops=1)
Filter: (template_id = 112)
Trigger for constraint $1: time=4051.219 calls=40000
Trigger for constraint $2: time=1616.194 calls=40000
Trigger for constraint cs_ibrs: time=2810.144 calls=40000
Trigger for constraint cs_ibct: time=4026.305 calls=40000
Trigger for constraint cs_ibbs: time=3517.640 calls=40000
Trigger for constraint cs_ibreq: time=774344.010 calls=40000
Total runtime: 790760.168 ms
(9 rows)
varshavka=> \d infoblocks
Table "public.infoblocks"
Column | Type | Modifiers
-----------------+-----------------------------+------------------------------------------------------
id | integer | not null default nextval(('IB_SEQ'::text)::regclass)
parent_block_id | integer |
nm_id | integer | default 0
template_id | integer | not null
author_id | integer |
birthdate | timestamp without time zone | not null
Indexes:
"infoblocks_pkey" PRIMARY KEY, btree (id)
"zeroparent" btree (parent_block_id) WHERE parent_block_id <> 0
Foreign-key constraints:
"$2" FOREIGN KEY (nm_id) REFERENCES newsmakers(nm_id) ON DELETE RESTRICT
"$5" FOREIGN KEY (author_id) REFERENCES users(user_id) ON DELETE RESTRICT
"cs_ibreq" FOREIGN KEY (parent_block_id) REFERENCES infoblocks(id) ON DELETE CASCADE
First of all: the first (zeroth!) thing you should do when noticing ugly query times is make sure that you have VACUUM ANALYZEd recently.
If you just need a one-off deletion, then see araqnid’s answer. But if you need something that will continue to work in the future when some rows have a nonzero, non-null parent_block_id field, read on.
I’m guessing that PostgreSQL doesn’t combine the deletions caused by ON DELETE CASCADE into a single query: the fact that the EXPLAIN output shows these as triggers suggests that each child row deletion will in fact be performed separately. Presumably each row will be found using indexed lookup on parent_block_id, but that’s still going to be much slower than a single sweep through the table.
So, you could probably get a big speedup by changing the ON DELETE CASCADE to ON DELETE RESTRICT, and manually compiling a list of all deletions that need to be performed in a temporary table, then deleting them all at once. This approach will be very fast if the maximum depth of your hierarchy is small. Here’s some pseudocode:
(I’m not sure, but you may in fact need to use ON DELETE NO ACTION instead of ON DELETE RESTRICT for the final DELETE to succeed; it’s not clear to me whether a single DELETE statement is allowed to delete a parent and all its descendants when ON DELETE RESTRICT is in effect. If that’s unacceptable for some reason, you could always loop through multiple DELETE statements, first deleting the bottommost level, then the next-bottommost and so on.)
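The temporary-table approach described above might look roughly like the following. This is only a sketch under assumptions: the scratch table to_delete and its depth column are hypothetical names, the constraint is recreated as ON DELETE NO ACTION (on the guess discussed above that RESTRICT may reject the single DELETE), and the level-by-level INSERT has to be repeated by hand or from client code, since 8.2 has no recursive queries.

```sql
-- One-time setup (assumption: NO ACTION rather than RESTRICT, so the
-- check runs only after the whole DELETE statement has finished).
ALTER TABLE infoblocks DROP CONSTRAINT cs_ibreq;
ALTER TABLE infoblocks ADD CONSTRAINT cs_ibreq
    FOREIGN KEY (parent_block_id) REFERENCES infoblocks(id)
    ON DELETE NO ACTION;

BEGIN;

-- Hypothetical scratch table; depth 0 = rows matched directly.
CREATE TEMPORARY TABLE to_delete (
    id    integer PRIMARY KEY,
    depth integer NOT NULL
) ON COMMIT DROP;

-- Level 0: the rows you actually asked to delete.
INSERT INTO to_delete (id, depth)
SELECT id, 0
FROM infoblocks
WHERE template_id = 112;

-- Repeat this statement until it reports "INSERT 0 0".
-- Each run adds the children of the deepest level collected so far.
INSERT INTO to_delete (id, depth)
SELECT child.id, parent.depth + 1
FROM infoblocks child
JOIN to_delete parent ON child.parent_block_id = parent.id
WHERE parent.depth = (SELECT max(depth) FROM to_delete)
  AND child.id NOT IN (SELECT id FROM to_delete);

-- Everything is collected: remove parents and descendants in one sweep.
DELETE FROM infoblocks
WHERE id IN (SELECT id FROM to_delete);

COMMIT;
```

The depth column is there for the fallback: if the constraint has to stay ON DELETE RESTRICT, the final DELETE can instead be run once per level, adding AND depth = N to the subquery and counting N down from the largest depth to 0, so children are always gone before their parents.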