I always tend to “protect” my persistance layer from violations via the service layer. However, I am beginning to wonder if it’s really necessary. What’s the point in taking the time in making my database robust, building relationships & data integrity when it never actually comes into play.
For example, consider a User table with a unique contraint on the Email field. I would naturally want to write blocker code in my service layer to ensure the email being added isn’t already in the database before attempting to add anything. In the past I have never really seen anything wrong with it, however, as I have been exposed to more & more best practises/design principles I feel that this approach isn’t very DRY.
So, is it correct to always ensure data going to the persistance layer is indeed “valid” or is it more natural to let the invalid data get to the database and handle the error?
Even though there isn’t a conclusive answer, I think it’s a great question.
First, I am a big proponent of including at least basic validation in the database and letting the database do what it is good at. At minimum, this means foreign keys,
NOT NULLwhere appropriate, strongly typed fields wherever possible (e.g. don’t put a text field where an integer belongs), unique constraints, etc. Letting the database handle concurrency is also paramount (as @Branko Dimitrijevic pointed out) and transaction atomicity should be owned by the database.If this is moderately redundant, than so be it. Better too much validation than too little.
However, I am of the opinion that the business tier should be aware of the validation it is performing even if the logic lives in the database.
It may be easier to distinguish between exceptions and validation errors. In most languages, a failed data operation will probably manifest as some sort of exception. Most people (me included) are of the opinion that it is bad to use exceptions for regular program flow, and I would argue that email validation failure (for example) is not an “exceptional” case.
Taking it to a more ridiculous level, imagine hitting the database just to determine if a user had filled out all required fields on a form.
In other words, I’d rather call a method
IsEmailValid()and receive a boolean than try to have to determine if the database error which was thrown meant that the email was already in use by someone else.This approach may also perform better, and avoid annoyances like skipped IDs because an INSERT failed (speaking from a SQL Server perspective).
The logic for validating the email might very well live in a reusable stored procedure if it is more complicated than simply a unique constraint.
And ultimately, that simple unique constraint provides final protection in case the business tier makes a mistake.
Some validation simply doesn’t need to make a database call to succeed, even though the database could easily handle it.
Some validation is more complicated than can be expressed using database constructs/functions alone.
Business rules across applications may differ even against the same (completely valid) data.
Some validation is so critical or expensive that it should happen prior to data access.
Some simple constraints like field type/length can be automated (anything run through an ORM probably has some level of automation available).