We have a PostgreSQL database behind the information system we are currently developing. I have been reluctant to deal with deletions properly, and it is starting to bother me: the project is already under way, the database is slowly filling up, and at some point users will actually want to delete data that is no longer relevant.
In our case, what users will delete are the ‘jobs’ we do for our clients. Once a job is finished, users usually do not want it listed on the web page, so they delete it. At first (while the system was in its testing phase, so no harm could be done), a user’s delete was a real DELETE in the database. Because it was set to cascade to the very bottom of our entity graph, it really deleted everything, and it also took a long time. Now that we use the system for real, I was afraid of accidental deletions, so I made it impossible for users to delete anything.
I think that the most important question is “What exactly does ‘deletion’ of a job mean in our business domain?” In our case, there are two points to this:
- Users do not want to see the job listed anymore (unless they explicitly request a list of old jobs, which I therefore have to keep)
- Some of the job’s data is gone for good and only a basic overview of the job’s status is kept
I’ve read many articles about why soft delete is good and a lot about why it is not (e.g. here). What seems like a better alternative to me is to move the deleted job to an archive table. At the same time, I would delete the job’s data that is no longer needed. The nice consequence is that I would not have to adjust all my queries to handle some kind of “DeletedOn” column, and the main job table would not be cluttered with inactive jobs.
My problem with this is more of a technical one: assuming I still need to keep some references from other entities to the deleted job, what is the best way to do that? Because other entities have foreign keys to the job table, I can’t just move the job to another table; the DB would not let me.
What is the usual, well-tested approach to this?
If I understand you correctly, you have some sort of “Jobs” in the DB and can’t delete all the related information, but need to keep part of it there.
There are two options that I use in such cases:
Add a Job state field
This field can have different values like new/in progress/waiting/delivered/deleted. Once you change your code to accommodate this new field, you have a lot of flexibility: you can offer the user filters based on the job state in the UI, etc.
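A minimal sketch of the state-field option, using Python's built-in sqlite3 as a stand-in so it can be run without a server. The table name `job`, its columns, and the helpers `delete_job` and `list_jobs` are assumptions made up for illustration; on PostgreSQL the same `UPDATE` and filtered `SELECT` would apply, possibly with an enum type instead of the `CHECK` constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        state TEXT NOT NULL DEFAULT 'new'
              CHECK (state IN ('new', 'in progress', 'waiting', 'delivered', 'deleted'))
    )
""")
conn.executemany(
    "INSERT INTO job (id, title, state) VALUES (?, ?, ?)",
    [(1, "Website redesign", "in progress"),
     (2, "Logo", "delivered"),
     (3, "Old brochure", "delivered")],
)

def delete_job(conn, job_id):
    """'Delete' a job by changing its state; the row itself stays in place."""
    conn.execute("UPDATE job SET state = 'deleted' WHERE id = ?", (job_id,))

def list_jobs(conn, include_deleted=False):
    """Return jobs ordered by id, hiding deleted ones unless asked for them."""
    sql = "SELECT id, title, state FROM job"
    if not include_deleted:
        sql += " WHERE state <> 'deleted'"
    return conn.execute(sql + " ORDER BY id").fetchall()

delete_job(conn, 3)
active_jobs = list_jobs(conn)                     # default listing for the web page
all_jobs = list_jobs(conn, include_deleted=True)  # explicit "old jobs" listing
```

Because no row is removed, foreign keys from other tables are unaffected; the cost is that every query listing jobs has to apply the state filter.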
Add a DeleteOn field and hide it
You rename the table, add the field, and create a view with the same name as the original table which filters out all records with DeleteOn set. The view gets an INSTEAD OF DELETE trigger which just sets that field for the respective job, so there is no cascading delete and no cluttering or changing of existing code. Foreign keys from other tables keep pointing at the renamed base table, so the references you need to keep stay valid. If need be, you can always extend the trigger to move all or part of the rows which have DeleteOn set to archive tables.
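The view-plus-trigger option can be sketched as follows, again with Python's built-in sqlite3 as a runnable stand-in. The names `job_all`, `invoice`, `delete_on` and `job_soft_delete` are hypothetical. PostgreSQL also supports INSTEAD OF DELETE triggers on views, but there the trigger body lives in a separate trigger function, so treat this as an illustration of the idea rather than ready-to-use PostgreSQL DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Base table: stands for the original job table after renaming,
    -- with the extra delete_on column added.
    CREATE TABLE job_all (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        delete_on TEXT            -- NULL means the job is still active
    );

    -- Another entity referencing jobs; its foreign key targets the base table.
    CREATE TABLE invoice (
        id     INTEGER PRIMARY KEY,
        job_id INTEGER NOT NULL REFERENCES job_all(id)
    );

    -- View under the original table name; hides jobs marked as deleted.
    CREATE VIEW job AS
        SELECT id, title FROM job_all WHERE delete_on IS NULL;

    -- A DELETE against the view only stamps delete_on on the base row.
    CREATE TRIGGER job_soft_delete INSTEAD OF DELETE ON job
    BEGIN
        UPDATE job_all SET delete_on = datetime('now') WHERE id = OLD.id;
    END;

    INSERT INTO job_all (id, title) VALUES (1, 'Website redesign'), (2, 'Old brochure');
    INSERT INTO invoice (id, job_id) VALUES (10, 2);
""")

# Existing application code keeps issuing a plain DELETE against "job".
conn.execute("DELETE FROM job WHERE id = ?", (2,))

visible = conn.execute("SELECT id, title FROM job ORDER BY id").fetchall()
kept = conn.execute(
    "SELECT id, delete_on IS NOT NULL FROM job_all ORDER BY id"
).fetchall()
invoices = conn.execute("SELECT id, job_id FROM invoice").fetchall()
```

After the `DELETE`, job 2 no longer appears in the view, while its row in `job_all` and the invoice referencing it are both still there. Moving rows to an archive table would be an extra statement inside the same trigger.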