Here’s the situation:
Let’s say I have a Dog model and a Vaccination model (so, a table storing rows of Dogs and a table storing rows of Vaccinations that were given to a Dog).
So, a Dog has_many Vaccinations and a Vaccination belongs_to a Dog.
I want to be able to quickly answer the question: “When was the last time Dog A got a Vaccination?” There are two ways to store this data:
1) Normalized database way: let the Vaccine table store everything. To answer the question, search the DB for all Vaccinations given to Dog A, and return the most recent one.
2) Denormalized database way: have a field in Dog called “last_vaccination”, and update this field every time a Vaccination is given to Dog A.
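To make the two options concrete, here is a rough sketch using Python's built-in sqlite3 rather than Rails. The table names, the `given_on` column, the index, and the helper functions are all assumptions made up for illustration; only `last_vaccination` comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dogs (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        last_vaccination TEXT      -- option 2: denormalized copy of the latest date
    );
    CREATE TABLE vaccinations (
        id INTEGER PRIMARY KEY,
        dog_id INTEGER NOT NULL REFERENCES dogs(id),
        given_on TEXT NOT NULL     -- assumed ISO dates, so string order matches date order
    );
    -- Assumed index: lets option 1 look up one dog's rows without a full table scan.
    CREATE INDEX idx_vaccinations_dog_date ON vaccinations (dog_id, given_on);
""")

def record_vaccination(dog_id, given_on):
    """Insert a vaccination and keep dogs.last_vaccination in sync (option 2)."""
    with conn:  # both writes commit together or roll back together
        conn.execute(
            "INSERT INTO vaccinations (dog_id, given_on) VALUES (?, ?)",
            (dog_id, given_on),
        )
        # Only move the cached date forward, so a backfilled old record cannot overwrite it.
        conn.execute(
            "UPDATE dogs SET last_vaccination = ? "
            "WHERE id = ? AND (last_vaccination IS NULL OR last_vaccination < ?)",
            (given_on, dog_id, given_on),
        )

def last_vaccination_normalized(dog_id):
    """Option 1: derive the answer from the vaccinations table."""
    row = conn.execute(
        "SELECT MAX(given_on) FROM vaccinations WHERE dog_id = ?", (dog_id,)
    ).fetchone()
    return row[0]

def last_vaccination_denormalized(dog_id):
    """Option 2: read the cached column on the dog row."""
    row = conn.execute(
        "SELECT last_vaccination FROM dogs WHERE id = ?", (dog_id,)
    ).fetchone()
    return row[0]

conn.execute("INSERT INTO dogs (id, name) VALUES (1, 'Dog A')")
conn.commit()
record_vaccination(1, "2023-05-01")
record_vaccination(1, "2024-02-10")
record_vaccination(1, "2022-11-20")  # an older record entered late

print(last_vaccination_normalized(1))
print(last_vaccination_denormalized(1))
```

Both lookups should agree as long as every write goes through `record_vaccination`. That "as long as" is the maintenance burden of option 2.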
The pros of #1 are: you get database normalization, so the date lives in one place and there’s no duplicate copy to keep accurate.
The pros of #2 are: performance. You don’t have to search the Vaccine table every time.
What’s the right way to do this?
I’m a big fan of a saying I heard from a DB guy at a software symposium years ago:
“Normalize til it hurts, denormalize til it works.”
Lots of truth in that.
FWIW, I think there’s a hole in the layout above – a “Vaccinations” table needs to be there that, in effect, ties the Dog to the Vaccine. Vaccinations != vaccines. That’s a more accurate representation of the normalized version. IMO.
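A rough sketch of what I mean, again in Python with sqlite3 just so it runs standalone. The column names and the sample rows are made up for illustration, and `latest_vaccination` is a hypothetical helper, not anything from Rails.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE dogs     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE vaccines (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per event: which dog got which vaccine, and when.
    CREATE TABLE vaccinations (
        id INTEGER PRIMARY KEY,
        dog_id     INTEGER NOT NULL REFERENCES dogs(id),
        vaccine_id INTEGER NOT NULL REFERENCES vaccines(id),
        given_on   TEXT NOT NULL   -- assumed ISO date strings
    );
""")

# Made-up sample data.
db.execute("INSERT INTO dogs VALUES (1, 'Dog A')")
db.executemany("INSERT INTO vaccines VALUES (?, ?)", [(1, "Rabies"), (2, "Parvovirus")])
db.executemany(
    "INSERT INTO vaccinations (dog_id, vaccine_id, given_on) VALUES (?, ?, ?)",
    [(1, 1, "2023-05-01"), (1, 2, "2024-02-10")],
)

def latest_vaccination(dog_name):
    """Return (vaccine name, date) for the dog's most recent vaccination, or None."""
    return db.execute(
        """
        SELECT v.name, vn.given_on
        FROM vaccinations AS vn
        JOIN dogs     AS d ON d.id = vn.dog_id
        JOIN vaccines AS v ON v.id = vn.vaccine_id
        WHERE d.name = ?
        ORDER BY vn.given_on DESC
        LIMIT 1
        """,
        (dog_name,),
    ).fetchone()

print(latest_vaccination("Dog A"))
```

With this layout the vaccine name is stored once in `vaccines`, and each `vaccinations` row only carries the two foreign keys plus the date.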