I have been playing with vectors and matrices where the size is encoded in their type, using the new DataKinds extension. It basically goes like this:
    data Nat = Zero | Succ Nat

    data Vector :: Nat -> * -> * where
        VNil :: Vector Zero a
        VCons :: a -> Vector n a -> Vector (Succ n) a
Now we want typical instances like Functor and Applicative. Functor is easy:
    instance Functor (Vector n) where
        fmap f VNil = VNil
        fmap f (VCons a v) = VCons (f a) (fmap f v)
But with the Applicative instance there is a problem: pure has to build a vector of the right length, and with only the type variable n to go on we don't know which constructor to return. However, we can define the instance inductively on the size of the vector:
    instance Applicative (Vector Zero) where
        pure = const VNil
        VNil <*> VNil = VNil

    instance Applicative (Vector n) => Applicative (Vector (Succ n)) where
        pure a = VCons a (pure a)
        VCons f fv <*> VCons a v = VCons (f a) (fv <*> v)
However, even though this instance applies for all vectors, the type checker doesn’t know this, so we have to carry the Applicative constraint every time we use the instance.
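To make the complaint concrete, here is a minimal sketch of how the constraint leaks into client code. It restates the definitions above for a current GHC (using Type from Data.Kind instead of *, and ticked constructors); the helpers vzipWith, vadd and toList are hypothetical names introduced only for illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}

import Data.Kind (Type)

data Nat = Zero | Succ Nat

data Vector :: Nat -> Type -> Type where
  VNil  :: Vector 'Zero a
  VCons :: a -> Vector n a -> Vector ('Succ n) a

instance Functor (Vector n) where
  fmap _ VNil        = VNil
  fmap f (VCons a v) = VCons (f a) (fmap f v)

-- The two inductive instances from the question.
instance Applicative (Vector 'Zero) where
  pure _        = VNil
  VNil <*> VNil = VNil

instance Applicative (Vector n) => Applicative (Vector ('Succ n)) where
  pure a                   = VCons a (pure a)
  VCons f fv <*> VCons a v = VCons (f a) (fv <*> v)

-- Hypothetical helper: n is unknown here, so GHC cannot pick an instance
-- and the Applicative constraint has to appear in the signature.
vzipWith :: Applicative (Vector n)
         => (a -> b -> c) -> Vector n a -> Vector n b -> Vector n c
vzipWith f xs ys = f <$> xs <*> ys

-- Hypothetical helper: every caller of vzipWith inherits the constraint.
vadd :: (Applicative (Vector n), Num a)
     => Vector n a -> Vector n a -> Vector n a
vadd = vzipWith (+)

-- Hypothetical helper for inspecting results.
toList :: Vector n a -> [a]
toList VNil        = []
toList (VCons a v) = a : toList v
```

At a concrete length the constraint is solved automatically, but any function that is polymorphic in n has to repeat it.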
Now, if this applied only to the Applicative instance it wouldn’t be a problem, but it turns out that the trick of recursive instance declarations is essential when programming with types like these. For instance, if we define a matrix as a vector of row vectors using the TypeCompose library,
    type Matrix nx ny a = (Vector nx :. Vector ny) a
we have to define a type class and add recursive instance declarations to implement both the transpose and matrix multiplication. This leads to a huge proliferation of constraints we have to carry around every time we use the code, even though the instances actually apply to all vectors and matrices (making the constraints kind of useless).
Is there a way to avoid having to carry around all these constraints? Would it be possible to extend the type checker so that it can detect such inductive constructions?
The definition of pure is indeed at the heart of the problem. What should its type be, fully quantified and qualified? A type that merely quantifies over n, taking an x and returning a Vector n x for every n, won't do, as there is no information available at run-time to determine whether pure should emit VNil or VCons. Correspondingly, as things stand, you can't just have a single unconstrained Applicative instance for Vector n.

What can you do? Well, working with the Strathclyde Haskell Enhancement, in the Vec.lhs example file, I define a precursor to pure, called vec, with a pi type, requiring that a copy of n be passed at runtime. This pi (n :: Nat). desugars as a forall over n followed by an explicit argument of type Natty n, where Natty, with a more prosaic name in real life, is the singleton GADT for Nat, and the curly braces in the equations for vec just translate Nat constructors to Natty constructors.

I then define a diabolical Applicative instance (switching off the default Functor instance) which demands further technology, still. The constraint {:n :: Nat:} desugars to something which requires that a Natty n witness exists, and by the power of scoped type variables, the same {:n :: Nat:} subpoenas that witness explicitly. Longhand, that's a class HasNatty n whose method natty produces the Natty n witness: we replace the constraint {:n :: Nat:} with HasNatty n and the corresponding term with (natty :: Natty n). Doing this construction systematically amounts to writing a fragment of a Haskell typechecker in type class Prolog, which is not my idea of joy, so I use a computer.

Note that the Traversable instance (pardon my idiom brackets and my silent default Functor and Foldable instances) requires no such jiggery-pokery. That's all the structure you need to get matrix multiplication without further explicit recursion.
TL;DR Use the singleton construction and its associated type class to collapse all of the recursively defined instances into the existence of a runtime witness for the type-level data, from which you can compute by explicit recursion.
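For readers without SHE, the construction can be approximated longhand in plain GHC. The following is a sketch under stated assumptions, not the code from Vec.lhs: Natty, HasNatty, natty and vec are the names used in the answer, while the constructor names Zy and Sy and the helpers vapp, transpose, dot, mmul and toRows are hypothetical. A matrix is represented here as a plain vector of row vectors rather than through TypeCompose.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data Nat = Zero | Succ Nat

data Vector :: Nat -> Type -> Type where
  VNil  :: Vector 'Zero a
  VCons :: a -> Vector n a -> Vector ('Succ n) a

-- Singleton GADT: exactly one value of Natty n for each n.
-- The constructor names Zy and Sy are assumptions.
data Natty :: Nat -> Type where
  Zy :: Natty 'Zero
  Sy :: Natty n -> Natty ('Succ n)

-- The associated class: a witness for n can be produced implicitly.
class HasNatty (n :: Nat) where
  natty :: Natty n

instance HasNatty 'Zero where
  natty = Zy

instance HasNatty n => HasNatty ('Succ n) where
  natty = Sy natty

-- Precursor to pure: explicit recursion on the runtime witness.
vec :: Natty n -> a -> Vector n a
vec Zy     _ = VNil
vec (Sy n) x = VCons x (vec n x)

-- Pointwise application needs no witness, the vectors themselves suffice.
vapp :: Vector n (a -> b) -> Vector n a -> Vector n b
vapp VNil         VNil         = VNil
vapp (VCons f fs) (VCons x xs) = VCons (f x) (vapp fs xs)

instance Functor (Vector n) where
  fmap _ VNil         = VNil
  fmap f (VCons x xs) = VCons (f x) (fmap f xs)

-- A single instance for every length, at the price of one constraint.
instance HasNatty n => Applicative (Vector n) where
  pure  = vec natty
  (<*>) = vapp

instance Foldable (Vector n) where
  foldr _ z VNil         = z
  foldr f z (VCons x xs) = f x (foldr f z xs)

instance Traversable (Vector n) where
  traverse _ VNil         = pure VNil
  traverse f (VCons x xs) = VCons <$> f x <*> traverse f xs

-- Transposition is traversal in the zipping Applicative.
transpose :: HasNatty m => Vector n (Vector m a) -> Vector m (Vector n a)
transpose = sequenceA

dot :: Num a => Vector n a -> Vector n a -> a
dot xs ys = sum (vapp (fmap (*) xs) ys)

-- Matrix product: pair each row of a with each column of b.
mmul :: (HasNatty k, Num a)
     => Vector n (Vector m a) -> Vector m (Vector k a) -> Vector n (Vector k a)
mmul a b = fmap (\row -> fmap (dot row) (transpose b)) a

-- Hypothetical helper for inspecting results.
toRows :: Vector n (Vector m a) -> [[a]]
toRows = map (foldr (:) []) . foldr (:) []
```

The only constraint left to carry is HasNatty on the dimension whose length has to be materialised (the column count in transpose and mmul); everything else is ordinary pattern matching on the vectors.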
What are the design implications?
GHC 7.4 has the type promotion technology, but SHE still has the singleton construction pi-types to offer. One clearly important thing about promoted datatypes is that they're closed, but that isn't really showing up cleanly yet: the constructability of singleton witnesses is the manifestation of that closedness. Somehow, if you have forall (n :: Nat). then it's always reasonable to demand a singleton as well, but to do so makes a difference to the generated code: whether it's explicit as in my pi construct, or implicit as in the dictionary for {:n :: Nat:}, there is extra runtime information to sling around, and a correspondingly weaker free theorem.

An open design question for future versions of GHC is how to manage this distinction between the presence and absence of runtime witnesses to type-level data. On the one hand, we need them in constraints. On the other hand, we need to pattern-match on them. E.g., should pi (n :: Nat). mean the explicit form, which takes a Natty n argument, or the implicit form, which carries the {:n :: Nat:} constraint? Of course, languages like Agda and Coq have both forms, so maybe Haskell should follow suit. There is certainly room to make progress, and we're working on it!
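The two readings can be put side by side in plain GHC. This is only an illustrative sketch: Natty, HasNatty and natty follow the answer, whereas the constructor names Zy and Sy and the names countE, countI, Dict, toImplicit and viaDict are assumptions made for the example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, ScopedTypeVariables #-}

import Data.Kind (Type)

data Nat = Zero | Succ Nat

data Natty :: Nat -> Type where
  Zy :: Natty 'Zero
  Sy :: Natty n -> Natty ('Succ n)

class HasNatty (n :: Nat) where
  natty :: Natty n

instance HasNatty 'Zero where
  natty = Zy

instance HasNatty n => HasNatty ('Succ n) where
  natty = Sy natty

-- Explicit form: the witness is an ordinary argument we can match on.
countE :: Natty n -> Int
countE Zy     = 0
countE (Sy n) = 1 + countE n

-- Implicit form: the witness arrives through the class dictionary.
-- The proxy argument only fixes n and is never inspected.
countI :: forall proxy n. HasNatty n => proxy n -> Int
countI _ = countE (natty :: Natty n)

-- A packaged dictionary, used to go from explicit back to implicit.
data Dict (n :: Nat) where
  Dict :: HasNatty n => Dict n

-- Pattern matching on the witness rebuilds the constraint step by step.
toImplicit :: Natty n -> Dict n
toImplicit Zy     = Dict
toImplicit (Sy n) = case toImplicit n of Dict -> Dict

-- Starts from an explicit witness, then calls the implicit version.
viaDict :: Natty n -> Int
viaDict w = case toImplicit w of Dict -> countI w
```

Since each form can be recovered from the other, the choice is mostly about which one the programmer has to write out and which one the compiler passes silently.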