I’m torn between two implementations of a certain data structure, and input from the Haskell community as to what is right/standard would be appreciated.
Data Types
Take, for example, an ADT “Server” which defines several servers as nullary data constructors.
data Server = Server1
            | Server2
            | Server3
Now, for each of these servers I want to have (among other things) the ability to get an IP address. Assuming I can code these statically, I can write a function “getUrl” that pattern matches on the constructor.
getUrl :: Server -> String
getUrl Server1 = "192.168.1.1"
and so on. Now any function that uses servers can take a Server argument and call getUrl.
serverStuff :: Server -> IO ()
This method seems to have the benefit of simple, non-polymorphic functions at the expense of having lots of pattern matching in getUrl. Additionally, if the programmer adds a Server but forgets to add the pattern to getUrl, they will get a runtime error without warning unless they compile with -Wall.
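A minimal sketch of the complete data-type version, for reference. The addresses for Server2 and Server3, the allServers helper, and the warning flags in the pragma are assumptions added for illustration; the pragma turns a forgotten clause into a compile error rather than a runtime failure.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns -Werror=incomplete-patterns #-}

-- Closed set of servers, one nullary constructor each.
data Server
  = Server1
  | Server2
  | Server3
  deriving (Show, Eq, Enum, Bounded)

-- Every constructor has a clause. With the pragma above, adding a
-- constructor without a matching clause fails at compile time.
getUrl :: Server -> String
getUrl Server1 = "192.168.1.1"
getUrl Server2 = "192.168.1.2" -- placeholder address (assumption)
getUrl Server3 = "192.168.1.3" -- placeholder address (assumption)

-- Hypothetical helper: enumerate all servers via Enum and Bounded.
allServers :: [Server]
allServers = [minBound .. maxBound]
```

Loading this in GHCi and evaluating map getUrl allServers lists one address per constructor.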
Typeclasses
Attacking the same problem with typeclasses, I can break my multi-constructor ADT into one type per server and create a type class for the URL.
data Server1 = Server1
data Server2 = Server2
data Server3 = Server3
class Server a where
  getUrl :: a -> String
instance Server Server1 where
  getUrl Server1 = "192.168.1.1"
and so on. Now instead of the simple non-polymorphic function I used before, I have to create something like
serverStuff :: Server a => a -> IO ()
and deal with the implications of ad-hoc polymorphism (function specialization and the like).
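Put together, the typeclass version might look like the sketch below. The Server2 address and the describe helper are assumptions added for illustration; only the class, the Server1 instance, and the signature of serverStuff come from the text above.

```haskell
-- One type per server, each with a single nullary constructor.
data Server1 = Server1
data Server2 = Server2

-- The class abstracts over "things that have a URL".
class Server a where
  getUrl :: a -> String

instance Server Server1 where
  getUrl Server1 = "192.168.1.1"

instance Server Server2 where
  getUrl Server2 = "192.168.1.2" -- placeholder address (assumption)

-- Hypothetical pure helper so the behaviour is easy to inspect.
describe :: Server a => a -> String
describe s = "connecting to " ++ getUrl s

-- Every consumer now carries the class constraint in its type.
serverStuff :: Server a => a -> IO ()
serverStuff = putStrLn . describe
```

Note that Server1 and Server2 are now distinct types, so a plain list such as [Server1, Server2] no longer type checks.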
On the bright side, the typeclass method is easy to expand, breaks up the pattern matching into smaller chunks, allows for greater abstraction, e.g. grouped servers (data ServerCenter1 = Server1 | Server2 | Server3), and, while you can still get runtime errors (without compiler warning) if you don’t declare getUrl, you’re at least forced to make that decision when you create the instance.
So, I’m torn but leaning toward instances as a better way of doing things. Is there a standard way to handle this issue, or is it a “whatever seems clean” type of thing?
If you are positive that the IP address is the only information you need your server type to contain, I would just implement them as a newtype around a string:
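The snippet that belonged here is not preserved, so the following is only a sketch of what a newtype around a string could look like. The value names server1 and server2, the second address, and the describe helper are assumptions.

```haskell
-- A server is just its address; particular servers are ordinary values.
newtype Server = Server String
  deriving (Show, Eq)

getUrl :: Server -> String
getUrl (Server url) = url

server1, server2 :: Server
server1 = Server "192.168.1.1"
server2 = Server "192.168.1.2" -- placeholder address (assumption)

-- Hypothetical pure helper used by serverStuff.
describe :: Server -> String
describe s = "connecting to " ++ getUrl s

serverStuff :: Server -> IO ()
serverStuff = putStrLn . describe
```

Adding a server then means defining another value, with no new constructor and no new pattern-match clause.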
Making it a full record (as in hammar’s comment) would allow you to add information while changing only the constructors, at the expense of GeneralizedNewtypeDeriving.
In general, I would use types to represent classes of things and values to represent particulars, so that nullary constructors are used only to represent abstract alternatives, e.g.
data Status = Published | Draft
(or the built-in Bool). Hardcoding data (such as IP addresses) into the type system or functions should be avoided unless there is a specific reason.
If you want server-specific behaviors, it is easy to add fields to the record:
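As a rough illustration of a behavior stored in a record field (the original snippet is not preserved; the field names serverUrl and healthCheck and both values are hypothetical):

```haskell
-- Hypothetical record: the field names are illustrative assumptions.
data Server = Server
  { serverUrl   :: String
  , healthCheck :: IO String -- server-specific behaviour stored as a value
  }

server1 :: Server
server1 = Server
  { serverUrl   = "192.168.1.1"
  , healthCheck = pure "ok: 192.168.1.1"
  }

-- Record update syntax swaps the behaviour; nothing at the call site
-- reveals which action is currently stored in the field.
server1Patched :: Server
server1Patched = server1 { healthCheck = pure "maintenance" }
```

Because the field holds an IO action, Server cannot derive Show here, so the stored behavior cannot be printed or logged.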
However, I would advise against doing this, because it makes other code obscure: a call through such a field can do absolutely anything, and you need to locate the last update of that field to determine what (since functions have no Show instance).
If the behavior depends on some property of the server, I would favor encoding that property and then dispatching on it.
I consider this approach superior to hardcoding server names into the function, because it explains why a given server has a given behavior (and makes changing the behavior of a particular server more local), and to having functions in the record fields, as it makes it easier to tell what a given invocation of runSomething will do (as one can inspect and log the ServerType).
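For illustration, dispatching on an encoded property could be sketched as follows. Only the names ServerType and runSomething appear in the text; the constructors Production and Staging, the record fields, and describeRun are assumptions.

```haskell
-- Hypothetical property that explains why servers behave differently.
data ServerType = Production | Staging
  deriving (Show, Eq)

data Server = Server
  { serverUrl  :: String
  , serverType :: ServerType
  } deriving (Show, Eq)

-- Pure description of what will happen, chosen by the property
-- rather than by the identity of a particular server.
describeRun :: Server -> String
describeRun server = case serverType server of
  Production -> "full checks on " ++ serverUrl server
  Staging    -> "quick checks on " ++ serverUrl server

-- The property is plain data, so it can be logged before acting on it.
runSomething :: Server -> IO ()
runSomething server = do
  putStrLn ("server type: " ++ show (serverType server))
  putStrLn (describeRun server)
```

Changing how a class of servers behaves then means editing one case branch, and changing one server means changing its serverType value.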