I am working on a fairly simple database application in which I am tracking MLB baseball pitchers. Currently, I have two tables:
- pitchers => name:string, …
- starts => start_date:date, pitch_count:integer, strikes:integer, balls:integer, era:float, run:integer, win:integer
With this configuration, it would be fairly simple to develop a report for each pitcher, provided I have all of the pitcher's historical starts.
My question is what is the best way to handle the two situations below when you don’t have complete historical start information:
- Situation 1 – You have no historical start detail, but you do have historical summary information: starts, wins, pitches, etc. per year and lifetime. This is the case of a pitcher that has retired.
- Situation 2 – You have some historical start detail as well as the historical summary information described in Situation 1.
What’s the best way of handling this:
Should I create dummy entries in the starts table to represent the summary information, or should I create a third table that holds the summary information and simply update that table after every start for the active players? Or is there some other, alternative best practice?
You’ve got 3 possibilities as far as I can see:
- have the individual starts
- summarised information for individual starts in a separate table
- dummy starts entries designed to generate the right results when you run a summary query on the whole starts table
I think the second one is best. To clarify, the idea is that when you only have summary information you put it in the summary table, but you also put the summary information generated from the individual starts you do have into this table. It breaks normalisation, because some of the summary information is fully dependent on existing starts records, but it will be more efficient, particularly if you have to output summarised information often. It does require that you update the summarised information every time you add or change a starts record, though. You’d need to make the two changes inside a transaction to ensure consistency.
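As a rough illustration of the second option, here is a minimal sketch using Python's built-in sqlite3 module. The `season_summaries` table, its columns, the `pitcher_id` foreign key and the `record_start` helper are all assumptions made up for this example (they are not in your schema), and the starts table is cut down to a few columns. It only covers adding a start; changing or deleting a start would need a similar adjustment to the summary row.

```python
import sqlite3


def create_schema(conn):
    """Create a cut-down version of the tables (column names are assumptions)."""
    conn.executescript("""
        CREATE TABLE pitchers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE starts (
            id          INTEGER PRIMARY KEY,
            pitcher_id  INTEGER NOT NULL REFERENCES pitchers(id),
            start_date  TEXT    NOT NULL,  -- ISO format, e.g. '2009-04-06'
            pitch_count INTEGER NOT NULL,
            win         INTEGER NOT NULL   -- 1 for a win, 0 otherwise
        );
        -- Hypothetical summary table: one row per pitcher per year.
        -- Rows may come from known totals (no start detail) or be
        -- maintained incrementally as starts are recorded.
        CREATE TABLE season_summaries (
            pitcher_id INTEGER NOT NULL REFERENCES pitchers(id),
            year       INTEGER NOT NULL,
            starts     INTEGER NOT NULL DEFAULT 0,
            wins       INTEGER NOT NULL DEFAULT 0,
            pitches    INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (pitcher_id, year)
        );
    """)


def record_start(conn, pitcher_id, start_date, pitch_count, win):
    """Insert a start and bump the matching summary row in one transaction."""
    year = int(start_date[:4])
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the starts row and the summary update are applied together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO starts (pitcher_id, start_date, pitch_count, win) "
            "VALUES (?, ?, ?, ?)",
            (pitcher_id, start_date, pitch_count, win),
        )
        # Make sure a summary row exists for this pitcher and year.
        conn.execute(
            "INSERT OR IGNORE INTO season_summaries (pitcher_id, year) "
            "VALUES (?, ?)",
            (pitcher_id, year),
        )
        conn.execute(
            "UPDATE season_summaries "
            "SET starts = starts + 1, wins = wins + ?, pitches = pitches + ? "
            "WHERE pitcher_id = ? AND year = ?",
            (win, pitch_count, pitcher_id, year),
        )
```

For a retired pitcher (Situation 1) you would insert rows straight into `season_summaries` and never touch `starts`. For Situation 2 you would load the known yearly totals the same way and call `record_start` only for new starts, so that the starts already included in those totals are not counted twice. Reports then read from the summary table in both cases.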