I’m building up a new database in SQL Server 2008 for some reporting, and there are many common business rules pertaining to this data that go into different types of reports. Currently these rules are mostly combined in larger procedural programs, in a legacy language, which I’m trying to move over to SQL. I’m shooting for flexibility in implementing reporting from this data, like some reporting in SAS, some in C#, etc.
My current approach is to break up these common rules (usually VERY simple logic) and encapsulate them in individual SQL UDFs. Performance is not a concern; I just want to use these rules to populate static fields in a sort of reporting “snapshot”, which can then be reported from in whatever way you want.
I like this modular approach as far as understanding what each rule is doing (and maintaining the rules themselves), but I’m also starting to become a bit afraid that the maintenance may also become a nightmare. Some rules depend on others, but I can’t really get away from that – these things build off each other…which is what I want…I think? 😉
Are there better ways to take this modular approach in a database? Am I on the right track, or am I thinking about this with too much of an application-development mindset?
SQL is set-based, and a modular approach applied to it inherently performs poorly.
Functions, stored procedures, and views all abstract the underlying logic. The performance problem comes into play when you use two (or more) of them that hit the same table(s): two queries are made to the same table(s) when one could have been used.
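To make the "two queries instead of one" point concrete, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for SQL Server purely so it can be run anywhere; the `orders` and `customers` tables and the two rule functions are hypothetical, not anything from the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
CREATE TABLE orders (customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1), (2);
INSERT INTO orders VALUES (1, 10.0), (1, 30.0), (2, 5.0);
""")

counter = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it, so the two styles can be compared."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

# Modular style: each hypothetical "rule" is its own function, and each one
# queries the orders table separately.
def order_count(customer_id):
    return run("SELECT COUNT(*) FROM orders WHERE customer_id = ?",
               (customer_id,))[0][0]

def order_total(customer_id):
    return run("SELECT SUM(amount) FROM orders WHERE customer_id = ?",
               (customer_id,))[0][0]

customer_ids = [row[0] for row in conn.execute(
    "SELECT customer_id FROM customers ORDER BY customer_id")]

modular = [(cid, order_count(cid), order_total(cid)) for cid in customer_ids]
modular_queries = counter["queries"]

# Set-based style: both rules are answered by a single pass over orders.
counter["queries"] = 0
set_based = run("""
    SELECT customer_id, COUNT(*), SUM(amount)
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
""")
set_based_queries = counter["queries"]

print(modular_queries, set_based_queries)
```

Both styles produce the same rows, but the modular version issues two queries against `orders` per customer while the set-based version issues one in total. With performance explicitly not a concern, that cost may be acceptable; the sketch only shows where it comes from.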
The use of multiple functions says to me that the data model was made to be very “flexible”. To me, that means questionable data typing and overall column/table definition. There’s a need for functions/etc because the database will allow anything to be stored, which means the possibility of bad data is very high. I’d rather put the effort into always having good/valid data, rather than working after the fact to combat existing bad data.
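As a rough illustration of putting the effort into valid data up front, the sketch below declares the rules as column constraints so bad rows are rejected at insert time. It again uses sqlite3 as a runnable stand-in (SQL Server also supports `NOT NULL` and `CHECK` constraints); the `invoice` table and its allowed values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the rules live in the column definitions, so invalid
# data never gets stored and nothing has to clean it up afterwards.
conn.execute("""
CREATE TABLE invoice (
    invoice_id INTEGER PRIMARY KEY,
    status     TEXT NOT NULL CHECK (status IN ('open', 'paid', 'void')),
    amount     REAL NOT NULL CHECK (amount >= 0)
)
""")

# A valid row is accepted.
conn.execute("INSERT INTO invoice (status, amount) VALUES ('open', 120.0)")

# Invalid rows (unknown status, negative amount) are refused by the database.
rejected = []
for status, amount in [("OPEN?", 50.0), ("paid", -5.0)]:
    try:
        conn.execute(
            "INSERT INTO invoice (status, amount) VALUES (?, ?)",
            (status, amount),
        )
    except sqlite3.IntegrityError:
        rejected.append((status, amount))

stored = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
print(stored, rejected)
```

Only the valid row ends up in the table; the two bad rows are caught by the constraints rather than by a function applied after the fact.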
The database is the place to contain this logic. It is faster than application code and, most importantly, centralized to minimize maintenance.