I’m having trouble modeling a data structure in Haskell. Suppose I’m
running an animal research facility and I want to keep track of my
rats. I want to track the assignment of the rats to cages and to
experiments. I also want to keep track of the weight of my rats, the
volume of my cages, and keep notes on my experiments.
In SQL, I might do:
create table cages (id integer primary key, volume double);
create table experiments (id integer primary key, notes text);
create table rats (
id integer primary key,
weight double,
cage_id integer references cages (id),
experiment_id integer references experiments (id)
);
(I realize that this allows me to assign two rats from different
experiments to the same cage. That is intended. I don’t actually run an
animal research facility.)
Two operations that must be possible: (1) given a rat, find the volume of its cage and (2) given a rat, get the notes for the experiment it belongs to.
In SQL, those would be
select cages.volume from rats
inner join cages on cages.id = rats.cage_id
where rats.id = ...; -- (1)
select experiments.notes from rats
inner join experiments on experiments.id = rats.experiment_id
where rats.id = ...; -- (2)
How might I model this data structure in Haskell?
One way to do it is
type Weight = Double
type Volume = Double
data Rat = Rat Cage Experiment Weight
data Cage = Cage Volume
data Experiment = Experiment String
data ResearchFacility = ResearchFacility [Rat]
ratCageVolume :: Rat -> Volume
ratCageVolume (Rat (Cage volume) _ _) = volume
ratExperimentNotes :: Rat -> String
ratExperimentNotes (Rat _ (Experiment notes) _) = notes
But wouldn’t this structure introduce a bunch of copies of the Cages and Experiments? Or should I just not worry about it and hope the optimizer takes care of that?
Here’s a short file I used for testing:
Then I started ghci and imported System.Vacuum.Cairo, available from the delightful vacuum-cairo package. (I’m not really sure why there are doubled-up arrows in this one, but you can ignore/collapse them.)
The rule of thumb, as should be illustrated above, is that new objects are created exactly when you call a constructor; otherwise, if you just name an already-created object, no new object is created. This is safe in Haskell because values are immutable: nothing can modify a shared Cage or Experiment, so two rats pointing at the same one can never interfere with each other.
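To make the rule of thumb concrete, here is a minimal sketch that reuses the types from the question. The names sharedCage, sharedExperiment, ratA, ratB and ratC, the sample numbers, and the deriving clauses are my own additions for illustration. The comments describe the unoptimized picture; an optimizing compiler is free to share more than this, but never less.

```haskell
type Weight = Double
type Volume = Double

data Cage = Cage Volume deriving (Show, Eq)
data Experiment = Experiment String deriving (Show, Eq)
data Rat = Rat Cage Experiment Weight deriving (Show, Eq)

-- One call to the Cage constructor: one cage object.
sharedCage :: Cage
sharedCage = Cage 10.0

-- One call to the Experiment constructor: one experiment object.
sharedExperiment :: Experiment
sharedExperiment = Experiment "maze learning"

-- Both rats merely name the existing cage and experiment, so each Rat
-- holds pointers to the same two objects; nothing is copied.
ratA, ratB :: Rat
ratA = Rat sharedCage sharedExperiment 0.31
ratB = Rat sharedCage sharedExperiment 0.28

-- Here the Cage constructor is called again, so this rat may get its own
-- cage object: equal in value to sharedCage, but a separate allocation.
ratC :: Rat
ratC = Rat (Cage 10.0) sharedExperiment 0.30

-- Same accessor as in the question.
ratCageVolume :: Rat -> Volume
ratCageVolume (Rat (Cage volume) _ _) = volume
```

Loading this into ghci and evaluating ratCageVolume ratA or ratCageVolume ratC gives the same volume either way; the difference between the shared and the re-constructed cage only shows up in memory use, not in the results.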