Recently, blog entries such as Computing the Size of a Hashmap explained how to reason about space complexities of commonly used container types. Now I’m facing the question of how to actually “see” which memory layout my GHC version chooses (depending on compile flags and target architecture) for weird data types (constructors) such as
data BitVec257 = BitVec257 {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Bool
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64

data BitVec514 = BitVec514 {-# UNPACK #-} !BitVec257
                           {-# UNPACK #-} !BitVec257

In C there are the sizeof operator and the offsetof macro, which let me “see” what size and alignment were chosen for the fields of a C struct.
I’ve tried looking at GHC Core in the hope of finding some hint there, but I didn’t know what to look for. Can somebody point me in the right direction?
My first idea was to use this neat little function, due to Simon Marlow:
Using it:
(Note that GHC is telling you that it cannot unbox Bool since it’s a sum type.)

The above function claims that your data type uses 74 bytes on a 64-bit machine. I find that hard to believe. I’d expect the data type to use 11 words = 88 bytes, one word per field. Even Bools take one word, as they are pointers to (statically allocated) constructors. I’m not quite sure what’s going on here.

As for alignment, I believe every field should be word aligned.
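
To make the word-count estimate concrete, here is a minimal sketch. It reads the 11-word figure as one header word plus ten payload words for BitVec514 with both BitVec257 fields unpacked; that reading is an assumption, as are the helper names expectedWords and expectedBytes, which are made up for illustration. As a cross-check it asks the runtime for the actual closure size via closureSize from the ghc-heap package, which is not the function mentioned above and assumes a GHC recent enough to provide it (8.10 or later, as far as I know).

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Word (Word64)
import Foreign.Storable (sizeOf)
import GHC.Exts.Heap (asBox, closureSize)

-- The data types from the question, unchanged.
data BitVec257 = BitVec257 {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Bool
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64

data BitVec514 = BitVec514 {-# UNPACK #-} !BitVec257
                           {-# UNPACK #-} !BitVec257

-- | Hypothetical helper: estimated words in a constructor closure,
-- assuming one header word plus one word per (unpacked) field.
expectedWords :: Int -> Int
expectedWords nFields = 1 + nFields

-- | Hypothetical helper: the same estimate in bytes for a given word size.
expectedBytes :: Int -> Int -> Int
expectedBytes wordBytes nFields = wordBytes * expectedWords nFields

main :: IO ()
main = do
  let wordBytes = sizeOf (0 :: Int)  -- machine word size in bytes
      -- Bang patterns force the values to WHNF so that we measure
      -- constructor closures rather than thunks.
      !small = BitVec257 1 2 True 3 4
      !big   = BitVec514 small small
  putStrLn ("Estimate for 10 fields: "
            ++ show (expectedBytes wordBytes 10) ++ " bytes")
  -- closureSize reports words, header included, without following pointers.
  putStrLn ("BitVec257 closure: "
            ++ show (closureSize (asBox small)) ++ " words")
  putStrLn ("BitVec514 closure: "
            ++ show (closureSize (asBox big)) ++ " words")
```

The measured numbers depend on the GHC version and flags. In particular, I would expect BitVec514 to approach the 11-word estimate only when the nested BitVec257 fields are really unpacked, which normally requires compiling with optimization (for example ghc -O2). Comparing a build with and without -O is a quick way to see whether the UNPACK pragmas took effect.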