Every night I need to trim back a table to only contain the latest 20,000 records. I could use a subquery:
DELETE FROM table WHERE id NOT IN (SELECT TOP 20000 id FROM table ORDER BY date_added DESC)
But that seems inefficient, especially if we later decide to keep 50,000 records. I'm using SQL Server 2005 and thought I could use ROW_NUMBER() OVER to do it: order the rows and delete every row whose ROW_NUMBER is greater than 20,000. But I couldn't get it to work. Is the subquery my best bet, or is there a better way?
If it just seems inefficient, I would make sure it is inefficient before I start barking up the wrong tree.
Measure the elapsed time, CPU usage, disk I/O, etc. to see how well it actually performs. I think you'll find it performs better than you expect.
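One way to take that measurement, sketched here under some assumptions: the table is called dbo.MyTable (a hypothetical name, the question only calls it "table"), id is a non-nullable key, and you are running this against a test copy of the data. SET STATISTICS IO and SET STATISTICS TIME report reads and CPU/elapsed time for each statement in the Messages tab, and wrapping each delete in a transaction that is rolled back lets you compare the subquery against a ROW_NUMBER() version on the same rows.

```sql
-- dbo.MyTable is a hypothetical name; substitute your own table.
-- Run against a test copy: rolling back a large delete is itself expensive.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: the subquery from the question.
BEGIN TRANSACTION;

DELETE FROM dbo.MyTable
WHERE id NOT IN (SELECT TOP (20000) id
                 FROM dbo.MyTable
                 ORDER BY date_added DESC);

ROLLBACK TRANSACTION;

-- Variant 2: number the rows newest-first in a CTE and delete
-- everything past the 20,000th row.
BEGIN TRANSACTION;

WITH ranked AS (
    SELECT ROW_NUMBER() OVER (ORDER BY date_added DESC) AS rn
    FROM dbo.MyTable
)
DELETE FROM ranked
WHERE rn > 20000;

ROLLBACK TRANSACTION;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If several rows share the same date_added at the 20,000 boundary, the two variants may keep different rows; adding id as a second ORDER BY column makes the cut-off deterministic. An index on date_added is likely to matter more than which variant you pick, but that is exactly what the measurements should tell you.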