Is it bad to duplicate names and prices into the order_line table (copied from the product and options tables)?
I have checked a few popular open-source PHP e-commerce scripts, and they do this.
Assume the following tables (quick example):
product table:
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| product_id | int(11)      | NO   | PRI | NULL    | auto_increment |
| name       | varchar(150) | NO   |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
options table (a product can have one or more options, e.g. small, large, x-large):
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| option_id  | int(11)      | NO   | PRI | NULL    | auto_increment |
| product_id | int(11)      | NO   |     | NULL    |                |
| name       | varchar(150) | NO   |     | NULL    |                |
| price      | decimal(6,2) | NO   |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
The company will be getting about 5000 new orders daily. I am looking for a reasonable way to design the order and order_line tables. Do you duplicate names and prices into the order_line table? Hundreds of prices in the options table will change every few months.
I have read about versioning (Type 2), but I'm not sure how it actually works. From what I understand, I can add a version_id field to the product, options and order_line tables, and whichever row has the MAX version_id is the latest version. That seems much easier than a StartDate and EndDate design.
I am looking for a design approach that is quick to implement and reasonable, not too complicated.
Your options table is not large (in row size, not number of rows), so storing the name multiple times shouldn't be a problem. However, if you want to ensure the same string is used for all "Large" options, then extracting the strings to a lookup table will help.
As a side note, you may want to reconsider your primary key for this table, as using an auto-increment field allows a product to have the same option applied more than once.
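
As a rough illustration of the primary key point, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the column types are approximations (SQLite has no decimal(6,2)), and the composite key on (product_id, name) is only one possible choice. With that key, a second "Large" option for the same product is rejected:

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE options (
        product_id INTEGER NOT NULL,
        name       TEXT    NOT NULL,
        price      NUMERIC NOT NULL,
        PRIMARY KEY (product_id, name)  -- assumed natural key, replaces option_id
    )
""")

conn.execute("INSERT INTO options VALUES (1, 'Large', 12.50)")

# A second 'Large' row for the same product violates the composite key.
try:
    conn.execute("INSERT INTO options VALUES (1, 'Large', 13.00)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```

If other tables already reference option_id, keeping the auto-increment column and adding a UNIQUE constraint on (product_id, name) should give the same protection.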
Having a lookup table for the names would require versioning as changes to the table have an effect on historical rows. You can use version numbers or date ranges, whichever is easier for you, although the use of dates does also let you know when the change to an option occurred.
Dates may be a little easier to use, as you can use triggers to update the table, writing the CURRENT_TIMESTAMP into the table without needing to know the previous version number. Using a version number requires a lookup before the update.
You might find it useful to have a look at "Developing Time-Oriented Database Applications in SQL" by Richard Snodgrass, available free from here: http://www.cs.arizona.edu/~rts/publications.html
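
To make the trigger idea above more concrete, here is a minimal sketch, again using Python's sqlite3 module in place of MySQL (MySQL's trigger syntax differs slightly). The options_history table, the changed_at column and the trigger name are hypothetical names chosen for illustration. The trigger copies the superseded row into the history table and stamps it with CURRENT_TIMESTAMP, so the application never has to look up a previous version number:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (
        option_id  INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL,
        name       TEXT    NOT NULL,
        price      NUMERIC NOT NULL
    );

    -- Hypothetical history table: one row per superseded name/price.
    CREATE TABLE options_history (
        option_id  INTEGER NOT NULL,
        name       TEXT    NOT NULL,
        price      NUMERIC NOT NULL,
        changed_at TEXT    NOT NULL
    );

    -- On every name or price change, keep the old values with a timestamp.
    CREATE TRIGGER options_keep_history
    AFTER UPDATE OF name, price ON options
    BEGIN
        INSERT INTO options_history (option_id, name, price, changed_at)
        VALUES (OLD.option_id, OLD.name, OLD.price, CURRENT_TIMESTAMP);
    END;
""")

conn.execute("INSERT INTO options VALUES (1, 1, 'Large', 12.50)")
conn.execute("UPDATE options SET price = 13.00 WHERE option_id = 1")

# The old price is now preserved in the history table.
history = conn.execute(
    "SELECT option_id, name, price FROM options_history"
).fetchall()
print(history)
```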
EDIT: A table with version information usually has a date/time field holding the 'valid_from' date for that row. New rows will have this automatically filled with CURRENT_TIMESTAMP so you know which is the most recent row. Other methods use two fields to record the start and end time when the row was valid. Using the two fields makes the queries easier, as you can do: SELECT … WHERE point_in_time BETWEEN start_date AND end_date
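
A minimal sketch of the two-field approach, using Python's sqlite3 module as a stand-in for MySQL. The option_prices table, the price_at helper, the sample dates and the '9999-12-31' marker for the current row are all assumptions made for illustration. Note that BETWEEN is inclusive at both ends, so this sketch uses a half-open range (start_date <= t < end_date) to make sure a change date matches exactly one row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE option_prices (
        option_id  INTEGER NOT NULL,
        price      NUMERIC NOT NULL,
        start_date TEXT    NOT NULL,  -- ISO dates compare correctly as text
        end_date   TEXT    NOT NULL   -- '9999-12-31' marks the current row
    )
""")

# Made-up sample data: one price change on 2012-07-01.
conn.executemany(
    "INSERT INTO option_prices VALUES (?, ?, ?, ?)",
    [
        (1, 12.50, "2012-01-01", "2012-07-01"),
        (1, 13.00, "2012-07-01", "9999-12-31"),
    ],
)


def price_at(option_id, point_in_time):
    """Return the price valid at point_in_time, or None if no row covers it."""
    row = conn.execute(
        """
        SELECT price FROM option_prices
        WHERE option_id = ?
          AND start_date <= ? AND ? < end_date
        """,
        (option_id, point_in_time, point_in_time),
    ).fetchone()
    return row[0] if row else None


print(price_at(1, "2012-03-15"))
print(price_at(1, "2012-07-01"))
```

An order_line row that stores the option_id and the order date could then recover the price that applied when the order was placed.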