I have a situation that involves Companies, Projects, and Employees who write Reports on Projects.
A Company owns many projects, many reports, and many employees.
One report is written by one employee for one of the company’s projects.
Companies each want different things in a report. Let’s say one company wants to know about project performance and speed, while another wants to know about cost-effectiveness. There are 5-15 criteria, set differently by each company, which ALL apply to all of that company’s project reports.
I have thought about different ways to do this, but I am currently stuck on this design:
- To the company table, add a text field `criteria`, which contains an array of the desired criteria, in order.
- In the report table, have a `company_id` and columns `criterion1`, `criterion2`, etc.
I am completely aware that this is typically considered horrible database design – inelegant and inflexible. So, I need your help! How can I build this better?
Conclusion
I decided to go with the serialized option in my case, for these reasons:
- My requirements for the criteria are simple – no searching or sorting will be required of the reports once they are submitted by each employee.
- I wanted to minimize database load – where these are going to be implemented, there is already a large page with overhead.
- I want to avoid complicating my database structure for what I believe is a relatively simple need.
- CouchDB and Mongo are not currently in my repertoire so I’ll save them for a more needy day.
This would be a great opportunity to use NoSQL! Seems like the textbook use-case to me. So head over to CouchDB or Mongo and start hacking.
With conventional DBs you are slightly caught in the problem of how much to normalize your data:
A sort of “good” way (meaning very normalized) would look something like this:
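A minimal sketch of such a normalized layout, using plain Ruby structs as stand-ins for ActiveRecord models. All names here (`Criterion`, `ReportValue`, `scores_for`, the column names) are assumptions for illustration, not taken from the question:

```ruby
# Hypothetical rows for a normalized layout; in Rails each Struct would be an
# ActiveRecord model backed by its own table.
Company     = Struct.new(:id, :name)
Criterion   = Struct.new(:id, :company_id, :name, :position)
Report      = Struct.new(:id, :company_id, :project_id, :employee_id)
ReportValue = Struct.new(:id, :report_id, :criterion_id, :value)

# Collect one report's values keyed by criterion name, in the company's order.
# In SQL this lookup is the join across reports, criteria, and report values.
def scores_for(report, criteria, report_values)
  company_criteria = criteria
    .select { |criterion| criterion.company_id == report.company_id }
    .sort_by(&:position)

  pairs = company_criteria.map do |criterion|
    row = report_values.find do |value|
      value.report_id == report.id && value.criterion_id == criterion.id
    end
    # A criterion with no stored value is reported as nil.
    [criterion.name, row && row.value]
  end
  pairs.to_h
end
```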
This turns something that is very simple and fast in NoSQL into a triple or quadruple join in SQL, and you end up with many models that do almost nothing.
Another way is to denormalize:
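A rough sketch of the serialized variant, assuming the criteria and the report values are stored as JSON in text columns. `CompanyRow`, `ReportRow`, `criteria_json`, and `values_json` are hypothetical names, and plain structs stand in for the models:

```ruby
require "json"

# Hypothetical stand-ins for rows whose text columns hold JSON.
CompanyRow = Struct.new(:id, :criteria_json) do
  # Ordered criterion names decoded from the company's text column.
  def criteria
    JSON.parse(criteria_json)
  end
end

ReportRow = Struct.new(:company, :values_json) do
  # Stored values keyed by criterion name, in the company's order.
  # Criteria the employee did not fill in come back as nil.
  def scores
    stored = JSON.parse(values_json)
    pairs = company.criteria.map { |name| [name, stored[name]] }
    pairs.to_h
  end
end

acme = CompanyRow.new(1, JSON.generate(%w[performance speed]))
report = ReportRow.new(acme, JSON.generate({ "speed" => 5, "performance" => 4 }))
puts report.scores.inspect
```

Keying the stored values by criterion name rather than by position means a company can reorder its criteria later without silently relabelling old reports.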
Related to that, a rather clever way of serializing at least the criteria (and maybe the values, if they are all boolean) is to use bit fields. This gives you reasonably easy migrations (hard to delete or modify, but easy to add) and searchability without any overhead.
A good plugin that implements this is Flag Shih Tzu, which I’ve used on a few projects and can recommend.
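To illustrate the idea behind bit fields, here is a hand-rolled sketch in plain Ruby. It is not Flag Shih Tzu’s API; the `FLAGS` table and the helper names are assumptions made up for this example:

```ruby
# Hypothetical criteria, one bit each. New entries can be appended with the
# next free bit; changing or reusing an existing bit would change the meaning
# of integers that are already stored.
FLAGS = {
  performance:        1 << 0,
  speed:              1 << 1,
  cost_effectiveness: 1 << 2
}.freeze

# Pack a list of criterion names into a single integer column value.
def encode_flags(names)
  names.reduce(0) { |mask, name| mask | FLAGS.fetch(name) }
end

# Unpack an integer back into the criterion names it contains.
def decode_flags(mask)
  FLAGS.select { |_name, bit| (mask & bit) != 0 }.keys
end

# Check a single criterion without decoding the whole mask.
def flag_set?(mask, name)
  (mask & FLAGS.fetch(name)) != 0
end

mask = encode_flags(%i[performance cost_effectiveness])
puts mask                      # 1 | 4
puts flag_set?(mask, :speed)
```

Because the whole set lives in one integer column, a database can filter on it with a bitwise AND in the `WHERE` clause, which is where the searchability comes from.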
Variable columns (e.g. `crit1`, `crit2`, etc.).
I’d strongly advise against it. You don’t get much benefit (it’s still not very searchable, since you don’t know which column your info is in) and it leads to maintainability nightmares. Imagine your DB grows to a few million records and suddenly someone needs 16 criteria. What could have been a complete non-issue is suddenly a migration that adds a completely useless field to millions of records.
Another problem is that a lot of the ActiveRecord magic doesn’t work with this. You’ll have to figure out what `crit1` means by yourself, and if you want to add validations on these fields, that adds a lot of pointless work.
So to summarize: have a look at Mongo or CouchDB, and if that seems impractical, go ahead and save your stuff serialized. If you need to do complex validation and don’t care too much about DB load, then normalize away and take option 1.