I’m implementing Bagwell’s Ideal Hash Trie in Haskell. To find an element in a sub-trie, he says to do the following:
Finding the arc for a symbol s, requires finding its corresponding bit in the bit map and then counting the one bits below it in the map to compute an index into the ordered sub-trie.
What is the best way to do this? The most straightforward approach seems to be to mask off everything except the bits below that bit and do a population count on the result. Is there a faster or better way?
Yes. In particular, both the mask and the popcount can be made quite efficient. Here is what the Clojure implementation does:
…
…
Here is what I did in my implementation in Haskell:
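import Data.Bits (finiteBitSize, popCount, shiftL, shiftR, (.&.))
import Data.Word (Word)

type Key    = Word
type Bitmap = Word
type Shift  = Int
type Subkey = Int -- we need to use this to do shifts, so an Int it is

-- These are architecture-dependent constants
-- (finiteBitSize replaces the deprecated bitSize)
bitsPerSubkey :: Int
bitsPerSubkey = floor . logBase 2 . fromIntegral . finiteBitSize $ (0 :: Word)

subkeyMask :: Bitmap
subkeyMask = (1 `shiftL` bitsPerSubkey) - 1

-- Count the one bits of bitmap b that lie below the single bit set in m
maskIndex :: Bitmap -> Bitmap -> Int
maskIndex b m = popCount (b .&. (m - 1))

-- The bitmap with only the bit for this key's subkey set
mask :: Key -> Shift -> Bitmap
mask k s = shiftL 1 (subkey k s)

-- Extract the bitsPerSubkey-wide slice of the key starting at bit s
{-# INLINE subkey #-}
subkey :: Key -> Shift -> Int
subkey k s = fromIntegral $ shiftR k s .&. subkeyMask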
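To make the mask-and-popcount lookup concrete, here is a minimal self-contained sketch of how the index might be used to fetch a child from a sparse node. The `Node` type, `sparseIndex`, `lookupChild`, and `example` are hypothetical names introduced only for illustration (they are not from Bagwell's paper or from my implementation above), and a plain list stands in for the array a real trie would use.

```haskell
import Data.Bits (popCount, shiftL, testBit, (.&.))
import Data.Word (Word)

-- Hypothetical node type, for illustration only: a bitmap plus a dense
-- list of children, ordered by the bit position each one occupies.
data Node a = Node Word [a]

-- Index into the dense child list: count the one bits strictly below bit i.
sparseIndex :: Word -> Int -> Int
sparseIndex bitmap i = popCount (bitmap .&. ((1 `shiftL` i) - 1))

-- Look up the child stored for subkey i, if its bit is set in the bitmap.
lookupChild :: Int -> Node a -> Maybe a
lookupChild i (Node bitmap children)
  | testBit bitmap i = Just (children !! sparseIndex bitmap i)
  | otherwise        = Nothing

-- Example node with children at subkeys 1, 4 and 6 (bitmap 82 = 0b1010010).
example :: Node String
example = Node 82 ["one", "four", "six"]
```

For subkey 4, the mask `(1 `shiftL` 4) - 1` is 15, `82 .&. 15` is 2, and `popCount 2` is 1, so the child sits at position 1 of the dense list. `sparseIndex bitmap i` computes the same value as `maskIndex bitmap (1 `shiftL` i)`.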