I am designing a laboratory information system (LIS) and am confused about how to design the tables for the different laboratory tests. How should I deal with a table that has an attribute with multiple values, where each of those values can itself have multiple values?
Here’s some of the data in my LIS design…
HEMATOLOGY <-------- Lab group
**************************************************************
CBC <-------- Sub group 1
RBC <-------- Component
WBC
Hemoglobin
Hematocrit
MCV
MCH
MCHC
Platelet count
Hemoglobin
Hematocrit
WBC differential
Neutrophils
Lymphocytes
Monocytes
Eosinophils
Basophils
Platelet count
Reticulocyte count
ESR
Bleeding time
Clotting time
Pro-time
Peripheral smear
Malarial smear
ABO
RH typing
CLINICAL MICROSCOPY <-------- Lab Group
**************************************************************
Routine urinalysis <-------- Sub group 1
Visual Examination <-------- Sub group 2
Color <-------- Component
Turbidity
Specific Gravity
Chemical Examination
pH
protein
glucose
ketones
RBC
Hgb
bilirubin
specific gravity
nitrite for bacteria
urobilinogen
leukocyte esterase
Microscopic Examination
Red Blood Cells (RBCs)
White Blood Cells (WBCs)
Epithelial Cells
Microorganisms (bacteria, trichomonads, yeast)
Trichomonads
Casts
Crystals
Occult Blood
Pregnancy Test
…This hierarchy of data also gets repeated in other lab groupings in my design (e.g. Blood chemistry, Serology, etc)…
Another question is, how am I gonna deal with a component (for example, RBC) which can be a member of one or more lab groups?
I already implemented a solution to my problem by making separate tables: one for lab group, one for sub group 1, one for sub group 2 and one for component. I then created another table to consolidate all of them by placing a foreign key to each in this table… the only trade-off is that some of the rows in this table may have null values. I’m not satisfied with my design, so I’m hoping someone could give me advice on how to make it right; any help would be greatly appreciated.
Here are a couple options:
If it is just the hierarchy above you are modeling, and there is no other data involved, then you can do it in two tables:
One problem with this is that you do not enforce that, for example, a sub_group must be a child of a lab_group, or that a component must be a child of either a sub_group_1 or a sub_group_2, but you could enforce these requirements in your application tier instead.
The plus side of this approach is that the schema is nice and simple. Even if the entities have more data associated with them, it might still be worth modeling the hierarchy like this and having some separate tables for the entities themselves.
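As a rough illustration of the two-table idea, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (entity, entity_link, kind) are my own assumptions, not taken from the original diagram, and the rows are just a few entries from the question. It shows RBC sitting under two different parents, and also that nothing at the data level stops a nonsensical link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table 1: every node in the hierarchy, whatever its level.
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN
         ('lab_group', 'sub_group_1', 'sub_group_2', 'component'))
);
-- Hypothetical table 2: parent/child links between any two nodes.
CREATE TABLE entity_link (
    parent_id INTEGER NOT NULL REFERENCES entity(id),
    child_id  INTEGER NOT NULL REFERENCES entity(id),
    PRIMARY KEY (parent_id, child_id)
);
""")

conn.executemany("INSERT INTO entity VALUES (?, ?, ?)", [
    (1, "HEMATOLOGY", "lab_group"),
    (2, "CBC", "sub_group_1"),
    (3, "RBC", "component"),
    (4, "CLINICAL MICROSCOPY", "lab_group"),
    (5, "Routine urinalysis", "sub_group_1"),
    (6, "Chemical Examination", "sub_group_2"),
])
# RBC (id 3) is linked under both CBC and Chemical Examination.
conn.executemany("INSERT INTO entity_link VALUES (?, ?)",
                 [(1, 2), (2, 3), (4, 5), (5, 6), (6, 3)])


def parents_of(name):
    """Return the names of all direct parents of the named entity."""
    cur = conn.execute("""
        SELECT p.name
        FROM entity_link l
        JOIN entity p ON p.id = l.parent_id
        JOIN entity c ON c.id = l.child_id
        WHERE c.name = ?
        ORDER BY p.name
    """, (name,))
    return [row[0] for row in cur]


rbc_parents = parents_of("RBC")

# The schema happily accepts a component as the parent of a lab group,
# which is the kind of rule the application tier would have to check.
conn.execute("INSERT INTO entity_link VALUES (?, ?)", (3, 1))
```

The last insert is the weakness described above: the database accepts it, so any "a component cannot be the parent of a lab_group" rule has to live in application code.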
If you want to enforce the correct relationships at the data level, then you are going to have to split it out into separate tables. Maybe something like this:
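The diagram is not reproduced here, so as a rough sketch only, the separate-table layout could look something like the following (Python's sqlite3 module, used purely for illustration). The link table name sub_group_component and the CHECK constraint that exactly one of the two sub-group columns is filled in are my own assumptions; the other table and column names follow the ones used in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE lab_group (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE sub_group_1 (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    lab_group_id INTEGER NOT NULL REFERENCES lab_group(id)
);
CREATE TABLE sub_group_2 (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    sub_group_1_id INTEGER NOT NULL REFERENCES sub_group_1(id)
);
CREATE TABLE component (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Single link table: each row points at a sub_group_1 OR a sub_group_2,
-- so the other column is NULL. The CHECK is an optional extra (assumption).
CREATE TABLE sub_group_component (
    component_id   INTEGER NOT NULL REFERENCES component(id),
    sub_group_1_id INTEGER REFERENCES sub_group_1(id),
    sub_group_2_id INTEGER REFERENCES sub_group_2(id),
    CHECK ((sub_group_1_id IS NULL) <> (sub_group_2_id IS NULL))
);
""")

conn.executemany("INSERT INTO lab_group VALUES (?, ?)",
                 [(1, "HEMATOLOGY"), (2, "CLINICAL MICROSCOPY")])
conn.executemany("INSERT INTO sub_group_1 VALUES (?, ?, ?)",
                 [(1, "CBC", 1), (2, "Routine urinalysis", 2)])
conn.execute("INSERT INTO sub_group_2 VALUES (1, 'Chemical Examination', 2)")
conn.execute("INSERT INTO component VALUES (1, 'RBC')")
# RBC belongs to CBC (a sub_group_1) and Chemical Examination (a sub_group_2).
conn.executemany("INSERT INTO sub_group_component VALUES (?, ?, ?)",
                 [(1, 1, None), (1, None, 1)])

# A sub_group_1 pointing at a lab_group that does not exist is rejected.
try:
    conn.execute("INSERT INTO sub_group_1 VALUES (3, 'Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

rbc_link_count = conn.execute(
    "SELECT COUNT(*) FROM sub_group_component WHERE component_id = 1"
).fetchone()[0]
```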
This assumes that each sub_group_1 is only related to a single lab_group. If this is not the case then add a link table between lab_group and sub_group_1. Likewise for the sub_group_1 -> sub_group_2 relationship.
There is a single link table between component and sub_group_1 and sub_group_2. This allows a single component to be related to several sub_group_1 and sub_group_2 entities. The fact it is a single table means that a lot of the sub_group_1_id and sub_group_2_id records will be null (like you mentioned in your question). You could prevent the nulls by having two separate link tables:
- sub_group_1_component with a foreign key to sub_group_1 and a foreign key to component
- sub_group_2_component with a foreign key to sub_group_2 and a foreign key to component
The reason I didn’t put this in the diagram is that for me, having to query two tables rather than one to get all the component -> sub_group relationships is too much of a pain. For the sake of a little denormalisation (allowing a few nulls) it is much easier to query a single table. If you find yourself allowing a lot of nulls (like a single link table for the relationships between all the entities here) then that is probably denormalising too much.
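To make the trade-off concrete, here is a small sketch (again Python's sqlite3, for illustration only) that builds both variants side by side and asks the same question of each: which sub-groups does RBC belong to? The single link table name sub_group_component is my own assumption, and the sub-group tables are stripped down to id and name to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE component   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sub_group_1 (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sub_group_2 (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Option A: one link table; one sub-group column is NULL in every row.
CREATE TABLE sub_group_component (
    component_id   INTEGER NOT NULL REFERENCES component(id),
    sub_group_1_id INTEGER REFERENCES sub_group_1(id),
    sub_group_2_id INTEGER REFERENCES sub_group_2(id)
);

-- Option B: two link tables; no NULLs anywhere.
CREATE TABLE sub_group_1_component (
    sub_group_1_id INTEGER NOT NULL REFERENCES sub_group_1(id),
    component_id   INTEGER NOT NULL REFERENCES component(id)
);
CREATE TABLE sub_group_2_component (
    sub_group_2_id INTEGER NOT NULL REFERENCES sub_group_2(id),
    component_id   INTEGER NOT NULL REFERENCES component(id)
);

INSERT INTO component   VALUES (1, 'RBC');
INSERT INTO sub_group_1 VALUES (1, 'CBC');
INSERT INTO sub_group_2 VALUES (1, 'Chemical Examination');
INSERT INTO sub_group_component   VALUES (1, 1, NULL), (1, NULL, 1);
INSERT INTO sub_group_1_component VALUES (1, 1);
INSERT INTO sub_group_2_component VALUES (1, 1);
""")

# Option A: every relationship for the component is in one table.
single_table = conn.execute("""
    SELECT COALESCE(s1.name, s2.name)
    FROM sub_group_component l
    LEFT JOIN sub_group_1 s1 ON s1.id = l.sub_group_1_id
    LEFT JOIN sub_group_2 s2 ON s2.id = l.sub_group_2_id
    WHERE l.component_id = 1
    ORDER BY 1
""").fetchall()

# Option B: the same answer needs a UNION across both link tables.
two_tables = conn.execute("""
    SELECT s1.name
    FROM sub_group_1_component l
    JOIN sub_group_1 s1 ON s1.id = l.sub_group_1_id
    WHERE l.component_id = 1
    UNION ALL
    SELECT s2.name
    FROM sub_group_2_component l
    JOIN sub_group_2 s2 ON s2.id = l.sub_group_2_id
    WHERE l.component_id = 1
    ORDER BY 1
""").fetchall()
```

Both queries return the same rows; the difference is only in how much SQL you have to write and maintain each time you need the full set of relationships.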