Okay, I have been asked to prepare a university database, and I am required to store certain data in a certain way.
For example, I need to store a course code that consists of a letter followed by two digits, e.g. I45, D61, etc.
So it should be VARCHAR(3), am I right? I am still unsure whether this is the right approach, and I am also unsure how to enforce the format in the SQL script.
I can’t find an answer in my notes, and I am currently writing the data dictionary for this question before I start on the script.
Any tips?
As much as possible, give your tables a primary key with no business meaning. You can then change your database design without badly affecting the application layer. With a dumb primary key, users don’t attach meaning to the identifier of a record.
What you are asking about is called an intelligent key, which is most often user-visible. A key that is not meant to be user-visible is called a dumb or surrogate key. Sometimes such a key does become visible, but that is not a problem, because users don’t interpret most dumb keys. For example, however much you change the title of this question, its id stays the same: https://stackoverflow.com/questions/10412621/
With an intelligent primary key, users sometimes want to dictate, for aesthetic reasons, how the key should be formatted and how it should look. The key could then get updated as often as users feel like it. That is a problem on the application side, because it entails cascading the changes to related tables, and on the database side too, because cascaded updates of keys on related tables are time-consuming.
Read details here:
http://www.bcarter.com/intsurr1.htm
Advantages of surrogate keys: http://en.wikipedia.org/wiki/Surrogate_key
You can implement a natural key (aka intelligent key) alongside the surrogate key (aka dumb key).
The advantage of that approach shows when, at some point in the future, the school expands and decides to offer a Spanish-language version of the Database Structure course: your database is insulated from the values that users introduce and interpret themselves.
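A minimal sketch of that combination, which also shows one way to enforce the letter-plus-two-digits format from the question. The question names no DBMS, so this assumes SQLite driven from Python's standard `sqlite3` module; the `course_id` and `title` columns and the `try_insert` helper are hypothetical names chosen for illustration. SQLite does not enforce the length in `VARCHAR(3)`, so the `CHECK` constraint does the real work here, and other database systems spell the pattern test differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE course (
        course_id   INTEGER PRIMARY KEY,          -- surrogate (dumb) key
        course_code VARCHAR(3) NOT NULL UNIQUE    -- natural (intelligent) key
            CHECK (course_code GLOB '[A-Z][0-9][0-9]'),
        title       VARCHAR(100) NOT NULL
    )
""")
conn.execute(
    "INSERT INTO course (course_code, title) VALUES ('D61', 'Database Structure')"
)


def try_insert(code):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO course (course_code, title) VALUES (?, ?)",
            (code, "Placeholder title"),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this table, `try_insert("I45")` is accepted, while `try_insert("d61")`, `try_insert("D611")` and a second `try_insert("D61")` are rejected, the last one by the `UNIQUE` constraint rather than the `CHECK`.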
Let’s say your database started out with an intelligent key, i.e. course_code itself is the primary key of the course table.
Then came the Spanish-language version of the Database Structure course. If users introduce their own rules into your system, they might be tempted to enter this as the course_code value:
D61/ESP. Others will write it as ESP-D61 or ESP:D61. Things can get out of control if users decide their own rules for primary keys. Later they will ask you to query the data based on the arbitrary rules they built into the format of the key, e.g. “List me all the Spanish language courses we offer in this school”. Epic requirement, isn’t it? So what will a good developer do to fit those changes into the database design? He or she will formalize the data structure and redesign the table so that the course code and the course language become separate columns that together form a composite primary key.
Do you see the problem with that? It incurs downtime, because you need to propagate the change to the foreign keys of every table that depends on the course table, and you first have to adjust those dependent tables as well. It causes trouble not only for the DBA but for the developer too.
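To make that cost concrete, here is a small sketch of what happens to dependent rows when an intelligent key changes. It again assumes SQLite via Python's `sqlite3` module; the `enrollment` table, its columns and the sample rows are hypothetical, added only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE course (
        course_code VARCHAR(10) PRIMARY KEY,   -- intelligent key
        title       VARCHAR(100) NOT NULL
    );
    CREATE TABLE enrollment (                  -- hypothetical dependent table
        student_name VARCHAR(100) NOT NULL,
        course_code  VARCHAR(10) NOT NULL
            REFERENCES course (course_code) ON UPDATE CASCADE
    );
    INSERT INTO course VALUES ('D61/ESP', 'Database Structure');
    INSERT INTO enrollment VALUES ('Ana', 'D61/ESP'), ('Luis', 'D61/ESP');
""")

# A user decides the code "looks better" in another format.
conn.execute(
    "UPDATE course SET course_code = 'ESP-D61' WHERE course_code = 'D61/ESP'"
)

# Every dependent row had to be rewritten along with the parent row.
dependent_codes = [
    row[0]
    for row in conn.execute(
        "SELECT course_code FROM enrollment ORDER BY student_name"
    )
]
```

One changed key in `course` rewrites every matching row in `enrollment`. With two rows that is harmless; with many dependent tables and a composite key it is the kind of change that needs a maintenance window.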
If you start with a dumb primary key from the get-go, then even if users introduce rules into the system without your knowing, this won’t entail massive data changes or schema changes to your database design, and it buys you time to adjust your application. Whereas if you put intelligence in your primary key, a user requirement such as the one above can make your primary key devolve naturally into a composite primary key. That is hard not only because of the restructuring of the database design and the massive updating of data; it is also hard to adapt your application quickly to the new design.
So with a surrogate key, even if users stash new rules or information in course_code, you can safely change your table without being forced to adapt your application to the new design right away. Your application keeps running and no downtime is needed. For the language-specific courses, the change amounts to adding a language column to the course table and moving the user-invented suffix out of course_code into it.
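A sketch of that change under the same assumptions as before (SQLite via Python's `sqlite3` module). The `enrollment` table, the `course_language` column name, the sample rows and the `'ENG'` default are all hypothetical choices made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE course (
        course_id   INTEGER PRIMARY KEY,       -- surrogate key, never changes
        course_code VARCHAR(10) NOT NULL,
        title       VARCHAR(100) NOT NULL
    );
    CREATE TABLE enrollment (                  -- hypothetical dependent table
        student_name VARCHAR(100) NOT NULL,
        course_id    INTEGER NOT NULL REFERENCES course (course_id)
    );
    INSERT INTO course (course_id, course_code, title) VALUES
        (1, 'D61',     'Database Structure'),
        (2, 'D61/ESP', 'Database Structure');
    INSERT INTO enrollment VALUES ('Ana', 2), ('Luis', 2);

    -- Formalize the user-invented suffix as a real column.
    ALTER TABLE course ADD COLUMN course_language VARCHAR(3) NOT NULL DEFAULT 'ENG';
    UPDATE course
       SET course_language = 'ESP',
           course_code     = substr(course_code, 1, 3)
     WHERE course_code LIKE '%/ESP';
""")

courses = conn.execute(
    "SELECT course_id, course_code, course_language FROM course ORDER BY course_id"
).fetchall()
enrollments = conn.execute(
    "SELECT student_name, course_id FROM enrollment ORDER BY student_name"
).fetchall()
spanish_courses = conn.execute(
    "SELECT course_code FROM course WHERE course_language = 'ESP'"
).fetchall()
```

The `enrollment` rows are untouched because they only ever stored `course_id`, and the "list all Spanish language courses" request becomes a plain filter on `course_language`.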
You can still run one massive UPDATE statement to split the user-imposed rules in course_code into two fields, and this doesn’t require any changes to the dependent tables. If you use an intelligent composite primary key, restructuring your data compels you to cascade the changes in the composite primary key to the composite foreign keys of the dependent tables. With a dumb primary key, your application keeps operating as usual, and you can adapt it to the new design (e.g. a new textbox for the course language) later on, at any time. The dependent tables don’t need a composite foreign key to point to the course table; they keep using the same old dumb/surrogate primary key. And with a dumb primary key, the size of your primary key and foreign keys won’t grow either.