I’m developing a multi-user application which uses a PostgreSQL database to store its data. I wonder how much logic I should shift into the database.
For example, when a user saves some data they have just entered, should the application simply send the data to the database and let the database decide whether it is valid? Or should the application be the smart part of the chain and check that the data is OK?
In the last (commercial) project I worked on, the database was very dumb. No constraints, no views etc.; everything was ruled by the application. I think that’s very bad, because every time a certain table was accessed in the code, the same code to check whether the access was valid was repeated over and over again.
By shifting the logic into the database (with functions, triggers and constraints), I think we can save a lot of code in the application (and a lot of potential errors). But I’m afraid that putting too much of the business logic into the database will backfire and that someday it will be impossible to maintain.
Are there any guidelines to follow that have proven themselves in real-life projects?
If you don’t need massive distributed scalability (think companies with as much traffic as Amazon or Facebook), then the relational database model is probably going to be sufficient for your performance needs. In that case, using a relational model with primary keys, foreign keys, constraints and transactions makes it much easier to maintain data integrity, and reduces the amount of reconciliation that needs to be done. Trust me: as soon as you stop using any of these things, you will need reconciliation, and even with them you likely will because of bugs.
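To make the integrity side concrete, here is a minimal sketch of the database rejecting bad rows on its own. It uses Python's built-in sqlite3 module with an in-memory database purely so that it runs anywhere; that is a stand-in for PostgreSQL, and the `users` and `orders` tables, their columns, and the `try_insert` helper are all made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(id),
        quantity INTEGER NOT NULL CHECK (quantity > 0)
    );
""")


def try_insert(sql, params):
    """Run one insert in its own transaction; return the error text, or None on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


results = [
    try_insert("INSERT INTO users (id, email) VALUES (?, ?)", (1, "a@example.com")),
    try_insert("INSERT INTO users (id, email) VALUES (?, ?)", (2, "a@example.com")),  # duplicate email
    try_insert("INSERT INTO orders (user_id, quantity) VALUES (?, ?)", (99, 1)),      # no such user
    try_insert("INSERT INTO orders (user_id, quantity) VALUES (?, ?)", (1, 0)),       # CHECK fails
    try_insert("INSERT INTO orders (user_id, quantity) VALUES (?, ?)", (1, 3)),       # valid order
]

for outcome in results:
    print("ok" if outcome is None else f"rejected: {outcome}")
```

The point of the sketch is that no application code had to check for duplicates, orphaned orders or non-positive quantities: the rules live in one place, and every code path that touches the tables gets them for free.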
However, most validation code is much easier to write in general-purpose languages like C#, Java or Python than in SQL, because that’s the type of thing those languages are designed for. This includes things like validating the formats of strings, dependencies between fields, etc. So I’d tend to do that in ‘normal’ code rather than in the database.
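As a rough illustration of the kind of validation meant here, the following Python sketch checks a string format and a couple of dependencies between fields. The `Booking` record, its fields and the rules are hypothetical, and the email pattern is deliberately loose rather than a complete address check.

```python
import re
from dataclasses import dataclass
from datetime import date

# Deliberately loose pattern: something@something.something, no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass
class Booking:
    """Hypothetical form data entered by a user."""
    email: str
    start: date
    end: date
    needs_invoice: bool = False
    company_name: str = ""


def validate(booking: Booking) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    # Format of a string
    if not EMAIL_RE.match(booking.email):
        errors.append("email: not a valid address")
    # Dependency between two fields
    if booking.end <= booking.start:
        errors.append("end: must be after start")
    # A field that is only required when another one is set
    if booking.needs_invoice and not booking.company_name.strip():
        errors.append("company_name: required when an invoice is requested")
    return errors


good = Booking("a@example.com", date(2024, 5, 1), date(2024, 5, 3))
bad = Booking("not-an-email", date(2024, 5, 3), date(2024, 5, 1), needs_invoice=True)

print(validate(good))
print(validate(bad))
```

Rules like these are easy to unit test and can return all problems at once with messages suited to the user, which is awkward to do from inside a constraint or trigger.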
So the pragmatic solution (and certainly the one we use) is to write the code where it makes sense. Let the database handle data integrity because that’s what it’s good at, and let the ‘normal’ code handle data validity because that’s what it’s good at. You’ll find a whole load of cases where this split doesn’t hold and it makes sense to do things in a different place, so just be pragmatic and weigh it up case by case.