I have noticed that when designing a database I tend to shift any repeating sets of data into a separate table. For example, say I had a table of people, with each person living in a state. I would then move these repeating states into a separate table and reference them with foreign keys.
However, what if I am not storing any more data about states? I would then have a table containing only StateID and State. Is this the correct approach? State is dependent on the primary key of the users table, so does moving it into its own table help with anything?
Thanks,
I believe that removing subsets of repeating data from a table and placing them in tables of their own is part of bringing a table into Second Normal Form.
Moving the state abbreviation into a table of its own is how you would normalize your database. It protects your “user” table from update anomalies. Say that, for some reason, the abbreviation “KY” for Kentucky is changed to “KQ”. Because the user table holds a foreign key that references the primary key of the states table, you only have to make one update to the states table to correct the entry for all of your users.
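To make the single-update point concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table and column names (states, users, state_id, abbreviation) and the sample rows are made up for illustration, not taken from the question.

```python
import sqlite3

# In-memory database; all table names, column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE states (
        state_id     INTEGER PRIMARY KEY,
        abbreviation TEXT NOT NULL UNIQUE
    );
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        state_id INTEGER NOT NULL REFERENCES states(state_id)
    );
    INSERT INTO states (state_id, abbreviation) VALUES (1, 'KY'), (2, 'OH');
    INSERT INTO users (name, state_id) VALUES ('Ann', 1), ('Bob', 1), ('Cal', 2);
""")

# One UPDATE against the lookup table changes the abbreviation for every user.
cur = conn.execute("UPDATE states SET abbreviation = 'KQ' WHERE abbreviation = 'KY'")
rows_changed = cur.rowcount

# Reading the abbreviation back requires a join through the foreign key.
user_states = conn.execute("""
    SELECT u.name, s.abbreviation
    FROM users AS u
    JOIN states AS s ON s.state_id = u.state_id
    ORDER BY u.user_id
""").fetchall()

print(rows_changed)
print(user_states)
```

Only one row in states is touched, yet both Kentucky users see the new abbreviation, because the user rows store the key rather than the text.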
That being said, it is fairly obvious that state abbreviations do not change often. So if you know for a fact that your database will never need to store more information about a state, it is logical and fundamentally sound to leave the state field in the user table. Denormalization of this kind is common. It makes the data in your user table easier to read and removes the overhead of the join. In the end, however, it is a matter of preference.
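For comparison, a rough sketch of the denormalized layout, again with Python's sqlite3 module and hypothetical names and rows. The state is stored as plain text in the users table, so reads need no join, but a change to an abbreviation has to touch every matching user row.

```python
import sqlite3

# In-memory database; the users table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        state   TEXT NOT NULL  -- abbreviation stored directly, no lookup table
    );
    INSERT INTO users (name, state) VALUES ('Ann', 'KY'), ('Bob', 'KY'), ('Cal', 'OH');
""")

# The state is readable straight from the users table, with no join.
flat_rows = conn.execute(
    "SELECT name, state FROM users ORDER BY user_id"
).fetchall()

# The trade-off: renaming an abbreviation updates one row per affected user.
cur = conn.execute("UPDATE users SET state = 'KQ' WHERE state = 'KY'")
flat_rows_changed = cur.rowcount

print(flat_rows)
print(flat_rows_changed)
```

With three users this costs nothing. Whether it matters for a larger table depends on how often the values really change, which for state abbreviations is close to never.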