My question is similar to "sql statement to delete records older than XXX as long as there are more than YY rows", but that question deals with only a single parent. I want to delete records for all parents in one go.
Consider this table:
CREATE TABLE Children
(
ChildId int NOT NULL,
ChildCreated datetime NOT NULL,
ParentId int NOT NULL
)
This could be any parent-child relationship, so the names are generic.
I would like to delete all children that are older than a month, but need to keep a minimum number of children for each parent regardless of their age.
I tried some statements with nested SELECTs and GROUP BYs which gave me some results but none gave me the correct result set.
Because I am using SQL Server I came up with the following solution that works great:
WITH CTE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY ParentId ORDER BY ChildCreated DESC) AS RowNo,
           ChildCreated
    FROM Children
)
DELETE FROM CTE
WHERE RowNo > 10
  AND ChildCreated < DATEADD(MONTH, -1, GETDATE())
The common table expression partitions the children by parent and assigns each one a consecutive row number, newest first. The newest child for each parent has a row number of 1, the tenth newest has 10. So I can just delete all records with a row number greater than 10 as long as they are also over a month old.
My question is: what if I have to do the same thing on a system where CTEs are not supported? What is the ANSI SQL-92 solution for this problem?
Based on the other responses and the relative simplicity of my query, I think I might be oversimplifying the issue, but I am assuming that, since ParentId is not nullable, it does not reference ChildId. In that case it can be achieved as simply as the below:
Although this exact SQL may need tweaking depending on the RDBMS, I don't know of any RDBMS where this principle cannot be applied.
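For illustration, here is a minimal sketch of a CTE-free statement that fits this description, using a correlated subquery to count how many newer siblings each row has. It is not taken from any answer in the thread. It assumes ChildId is unique (used only to break ties between children created at the same instant), that the minimum to keep is 10 as in the CTE version, and that the system supports SQL-92 interval arithmetic.

```sql
-- Sketch only: assumes ChildId is unique and that 10 children are kept per parent.
-- A row is deleted when it is older than one month AND at least 10 children
-- of the same parent are newer than it (equivalent to RowNo > 10 above).
DELETE FROM Children
WHERE ChildCreated < CURRENT_TIMESTAMP - INTERVAL '1' MONTH
  AND (
        SELECT COUNT(*)
        FROM Children newer
        WHERE newer.ParentId = Children.ParentId
          AND (
                newer.ChildCreated > Children.ChildCreated
                -- tie-breaker for identical timestamps (assumes unique ChildId)
                OR (newer.ChildCreated = Children.ChildCreated
                    AND newer.ChildId > Children.ChildId)
              )
      ) >= 10;
```

The date expression is the part most likely to need adjusting: on SQL Server, for example, it would be DATEADD(MONTH, -1, GETDATE()) as in the question. Some systems, MySQL among them, reject a subquery that reads the table being deleted from, so there the inner SELECT may have to be wrapped in a derived table or the keys collected in a separate step first.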