We’re trying to add a unique constraint to a PostgreSQL table in a way that deletes the duplicates rather than throwing an error. The constraint spans two columns, and the table has no primary key. For example:
 i_id | term   | date_created
------+--------+--------------
    1 | 'mako' |    123456789
    1 | 'mako' |    123451234
    1 | 'tele' |    213456852
    2 | 'rake' |    598521542
So, in this example, we would need to remove the second row before we could safely add the unique constraint, which would be over the columns (i_id, term). Normally we would do a delete with a select distinct thrown in, but we don’t have any key that distinguishes one duplicate row from another.
(WTF didn’t we have a unique constraint from the start? Go figure.)
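For reference, here is a quick sketch of how the offending (i_id, term) pairs could be listed before deleting anything. The name my_table is a placeholder, not our real table name.

```sql
-- my_table is a hypothetical stand-in for the real table name.
-- Lists every (i_id, term) pair that occurs more than once,
-- i.e. every pair that would violate the unique constraint.
SELECT i_id, term, count(*) AS copies
FROM my_table
GROUP BY i_id, term
HAVING count(*) > 1
ORDER BY i_id, term;
```

If this returns no rows, the constraint can be added without deleting anything.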
I’m thinking a delete statement would be the best, but I can’t simply write
delete from table where row_id not in (select row_id ... distinct something ... )
because there is no primary key for the row. I’d rather avoid a temporary table, if possible. Any suggestions?
EDIT: Sorry. We’re using postgres 8.4.
EDIT 2: The solution we’re using is:
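delete from table  -- "table" is a placeholder for the real table name
where ctid not in (
    -- distinct on keeps the first row of each (i_id, term) group; with no
    -- further sort key, which duplicate survives is arbitrary
    select distinct on (i_id, term) ctid
    from table
    order by i_id, term
);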
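In case it helps anyone else, here is a sketch of how the whole sequence might look, with the delete and the new constraint in one transaction. The names my_table and my_table_i_id_term_key are placeholders, and keeping the newest date_created per pair is only an assumption about which duplicate should survive.

```sql
-- Hypothetical names: my_table stands in for the real table,
-- my_table_i_id_term_key for the new constraint.
BEGIN;

-- Keep exactly one row per (i_id, term). The extra sort key makes the
-- survivor the row with the largest date_created (an assumed preference).
DELETE FROM my_table
WHERE ctid NOT IN (
    SELECT DISTINCT ON (i_id, term) ctid
    FROM my_table
    ORDER BY i_id, term, date_created DESC
);

-- With the duplicates gone, the constraint can be added.
ALTER TABLE my_table
    ADD CONSTRAINT my_table_i_id_term_key UNIQUE (i_id, term);

COMMIT;
```

If another session inserts a new duplicate between the delete and the alter, the ALTER TABLE should fail and the whole transaction rolls back, so the table is never left half-cleaned.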
Thanks guys!