Let’s imagine I have a relational database in which, among other things, I want to store employee names and their identification strings. The format of the identification string is strictly defined: it’s three upper-case alphabetical characters, followed by a dash, followed by a four-digit number.
Question: does any relational database allow you to define a regular expression that a particular text field must conform to? As in my example, it would be nice to have the database check all employee id values against a simple regex, rather than doing it at the UI level.
Another question: if I have problems like this (i.e. the need to validate field values against an additional set of constraints), does it mean that my schema is denormalized and I should fix it?
With regard to your second question, it depends. (Of course it depends. It always depends.) If you always use your Employee Identification strings as a single “whole” value, then the column is normalized. If you find that you are constantly breaking them into the “first and second” parts (3 characters, 4 digits), then you are breaking first normal form. (Roughly, you have two facts in one column, and should split them into their own columns.)
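To make the “two facts in one column” case concrete, here is a minimal Python sketch of splitting an identifier into its two parts, so that each could be stored in its own column. The function name `split_employee_id` and the sample value are hypothetical, chosen only for illustration.

```python
import re
from typing import Tuple

# Three upper-case letters, a dash, four digits (the format from the question).
_ID_PATTERN = re.compile(r"([A-Z]{3})-([0-9]{4})")


def split_employee_id(employee_id: str) -> Tuple[str, int]:
    """Split an id such as 'ABC-0042' into its letter part and numeric part.

    Raises ValueError if the string does not match the expected format.
    """
    match = _ID_PATTERN.fullmatch(employee_id)
    if match is None:
        raise ValueError(f"malformed employee id: {employee_id!r}")
    prefix, number = match.groups()
    return prefix, int(number)
```

If code like this ends up scattered all over the application, that is a hint the two parts are really separate facts. If the identifier is only ever stored, compared, and displayed whole, a single column is fine.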
Assuming proper normalization, to my mind the fact that you have to rely on the database to ensure the data is in proper form raises questions about the integrity of your data sources. Why is the data not checked, cleansed, and put into proper form before it is passed to the database? RDBMSes are really good at storing, sorting, and retrieving data, but they’re not so hot at running complex algorithms. It’s just not what they’re for. You can do it in the database, yes, but there are better ways to do it.
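For what it’s worth, here is a rough sketch of both options side by side, using Python and its bundled SQLite module. The application checks the value with a regex before it reaches the database, and a CHECK constraint acts as a backstop inside the database. The table and function names (`employee`, `add_employee`, and so on) are made up for this example. SQLite has no regex operator enabled by default, so the constraint uses its GLOB pattern syntax instead. Other engines have their own regex operators usable in a CHECK constraint (PostgreSQL’s `~`, for instance), so treat the SQL here as one possible spelling rather than a portable one.

```python
import re
import sqlite3

# Three upper-case letters, a dash, four digits (the format from the question).
EMPLOYEE_ID_RE = re.compile(r"[A-Z]{3}-[0-9]{4}")


def is_valid_employee_id(value: str) -> bool:
    """Application-level check: True only if the whole string matches."""
    return EMPLOYEE_ID_RE.fullmatch(value) is not None


def create_employee_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical employee table with a database-level format check."""
    conn.execute(
        """
        CREATE TABLE employee (
            name        TEXT NOT NULL,
            employee_id TEXT NOT NULL
                CHECK (employee_id GLOB '[A-Z][A-Z][A-Z]-[0-9][0-9][0-9][0-9]')
        )
        """
    )


def add_employee(conn: sqlite3.Connection, name: str, employee_id: str) -> None:
    """Validate in the application first, then hand clean data to the database."""
    if not is_valid_employee_id(employee_id):
        raise ValueError(f"malformed employee id: {employee_id!r}")
    conn.execute(
        "INSERT INTO employee (name, employee_id) VALUES (?, ?)",
        (name, employee_id),
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_employee_table(conn)
    add_employee(conn, "Alice", "ABC-1234")  # passes both checks
    try:
        # Bypass the application check to show the database rejecting bad data.
        conn.execute("INSERT INTO employee VALUES (?, ?)", ("Bob", "abc-12"))
    except sqlite3.IntegrityError as exc:
        print("rejected by the database:", exc)
```

With this split, the application gives the user a friendly error early, and the constraint only fires when something slips past it, such as a bulk import or a second client writing to the same table.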