I have a SQL Server database running in Full recovery. I need to remove data (around 30-40 million records), but I cannot take the database offline as it’s in constant use. I also cannot switch it to Simple recovery in case anything happens and we lose live data. When I try to remove the data in chunks (around 2 million rows each), the transaction log becomes extremely large and the process becomes extremely slow. Because backup jobs run at night, I only have a small timeframe.
Does anyone have any thoughts on how I can do this? I thought about copying the table into another database (in Simple recovery) and then removing the data there. Is this a good idea?
There are 3 tables in question: Campaign, Events and Targets. It’s the Events table that holds the millions of records, and this is what takes the time to delete. They all have the necessary relations via Id columns.
You have to use small chunks, otherwise your transaction log will grow.
Every one of the 30-40 million deletes will be logged. If you create a new table and copy the “to keep” rows, you’ll still have 50+ million logged rows. Simple vs full recovery doesn’t matter here: each delete/insert is logged.
If the log grows in simple recovery, then I suspect you are doing it all in one transaction. In that case the 30-40 million deletes are still held in the log, even in simple recovery, because the whole transaction might have to be rolled back.
For 40 x 1 million deletes without a wrapping transaction in simple recovery, you can use CHECKPOINT to assist in log tidy-up.
See Bulk DELETE on SQL Server 2008 (Is there anything like Bulk Copy (bcp) for delete data?) for more
But something like:
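A minimal sketch of such a batched delete, not necessarily the exact code intended here. Only the `Events` table name comes from the question; the `dbo` schema, the `CampaignId` column, the `#CampaignsToDelete` temp table and the batch size are assumptions for illustration.

```sql
-- Sketch only: CampaignId, #CampaignsToDelete and the batch size are
-- hypothetical; adapt the WHERE clause to however you identify rows to remove.
SET NOCOUNT ON;

DECLARE @BatchSize int = 100000;  -- assumed starting point; tune for your system
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- No explicit BEGIN TRAN: each DELETE commits on its own, so the log
    -- only has to hold one batch of active records at a time.
    DELETE TOP (@BatchSize) e
    FROM dbo.Events AS e
    WHERE EXISTS (SELECT 1
                  FROM #CampaignsToDelete AS c
                  WHERE c.CampaignId = e.CampaignId);

    SET @Rows = @@ROWCOUNT;  -- 0 once nothing is left to delete

    -- In simple recovery, a checkpoint lets the inactive log be reused.
    -- In full recovery, log space is only freed by a log backup instead.
    CHECKPOINT;
END;
```

In full recovery the `CHECKPOINT` alone will not stop the log from growing, so frequent log backups while the loop runs would probably be needed.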
Process:
You don’t have many other options if you insist on deleting 30+ million rows in one go in a short window…