I’m trying to figure out the best way to normalize the following data set, using Access 2010.
I’m building a database to track data entry from a group of workers. They read documents (maps covered with code numbers for installation jobs, plus associated information for each job) and enter these items into the database. Two main types of code number are gathered from each document. Currently all their data is written to a flat table with the following fields (* indicates the field can be blank):
- workerID: worker’s ID number
- documentID: unique string associated with each document
- work_date: the date on which worker entered this data
- code1: first type of code number
- *code1Year: installation year associated with code1
- *code1Type: string for code1’s job type
- *code1Loc1: address number for code1 (type 1)
- *code1Loc2: address number for code1 (type 2)
- *code1Loc3: address number for code1 (type 3)
- code2: second type of code number
- groupID: membership group for code (can be used with either code1 or code2)
- code1verified: boolean flag indicating that code1 and its associated info have been verified as correctly transcribed from the document
- code2verified: boolean flag indicating that code2 and its associated info have been verified as correctly transcribed from the document
Here’s my normalization plan:
- tableName: (PRIMARY_KEY),(foreign_key),field1,field2…
- workers: (WORKER_ID)
- groups: (GROUP_ID)
- code1Type: (CODE1TYPE_ID),code1Type
- workDays: (WORKDAY_ID),(worker_id),work_date
- locTypes: (LOC_ID),locationType
- code1: (CODE1_ID),(workday_id),code1,(code1Type_id),(group_id),code1verified
- code1Loc: (CODE1LOC_ID),(code1_id),(loc_id),code1Loc
- code1Year: (CODE1YEAR_ID),(code1_id),code1Year
- code2: (CODE2_ID),(group_id),(workday_id),code2,code2verified
Is this the best way to relate these values to each other?
Without really knowing the data, the best I can say is it needs some work.
I presume that code1Year, code1Type, and code1Loc* are not also encoded in code1.
I think you are still going to need a table to store the row that you’ve read in. It might only be something like this:
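As a minimal sketch of that idea, assuming one row per document a worker transcribes: the snippet below uses Python's built-in sqlite3 module as a stand-in for Access 2010, so the DDL would need adapting. The `entries` table name, its columns, and the sample values are my assumptions, pieced together from the field list in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
CREATE TABLE workers (
    worker_id INTEGER PRIMARY KEY
);

-- Hypothetical table: one row per (worker, document, date) that was read in.
CREATE TABLE entries (
    entry_id    INTEGER PRIMARY KEY,
    worker_id   INTEGER NOT NULL REFERENCES workers(worker_id),
    document_id TEXT    NOT NULL,   -- the unique document string
    work_date   TEXT    NOT NULL    -- ISO date, e.g. '2012-03-05'
);
""")

# Made-up sample values, for illustration only.
conn.execute("INSERT INTO workers (worker_id) VALUES (1)")
conn.execute(
    "INSERT INTO entries (worker_id, document_id, work_date) VALUES (?, ?, ?)",
    (1, "MAP-001", "2012-03-05"),
)
```

The code1 and code2 tables could then point at `entry_id`, which also keeps documentID in the design; the plan in the question does not seem to store it anywhere.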
I’d consider dropping the separate table for CODE1YEAR, since it should just be an integer, and how often are you going to relabel the year 2009 to something else? Likewise with WORKDAY_ID: why not just make it a date field?
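To make that concrete, here is a rough sketch of a code1 table that carries the year and the date as ordinary columns. Again this is Python's sqlite3 standing in for Access 2010, and the snake_case column names and sample codes are assumptions of mine, not anything from your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE code1 (
    code1_id       INTEGER PRIMARY KEY,
    code1          TEXT    NOT NULL,
    code1_year     INTEGER,             -- nullable: the year may be blank
    work_date      TEXT    NOT NULL,    -- plain date column, no workday_id lookup
    code1_verified INTEGER NOT NULL DEFAULT 0
);
""")

# Made-up sample rows: one with an installation year, one without.
conn.execute(
    "INSERT INTO code1 (code1, code1_year, work_date) VALUES (?, ?, ?)",
    ("A-1001", 2009, "2012-03-05"),
)
conn.execute(
    "INSERT INTO code1 (code1, work_date) VALUES (?, ?)",
    ("A-1002", "2012-03-06"),
)

rows = conn.execute(
    "SELECT code1, code1_year FROM code1 ORDER BY code1_id"
).fetchall()
```

A blank year is just a NULL in the column, so no extra table or join is needed to handle the optional case.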
It’s not obvious to me why Code1 and Code2 include group_id and workday_id.
Also, WORKER_ID and GROUP_ID seem kind of bare, unless they are only being used for validation of the IDs.