I’m writing a small “hello world” kind of program that groups identical files by different “reasons”, e.g. same size, same content, same checksum, etc.
I’ve got to the point where I want to write a function like this (DuplicateReason is an algebraic data type that states the reason why two files are considered identical):
getDuplicatesByMethods :: (Eq a) => [((FilePath -> a), DuplicateReason)] -> IO [DuplicateGroup]
In each tuple, the first element is a function that, given a file’s path, returns some (Eq a) value, such as a ByteString (the content), a Word32 (a checksum), or an Int (the size).
Clearly, Haskell doesn’t like that these functions have different types, so I need to unify them somehow.
The only way I see is to create a type like
data GroupableValue = GroupString String | GroupInt Int | GroupWord32 Word32
And then, to make life easier, to define a type class like
class GroupableValueClass a where
    toGroupableValue :: a -> GroupableValue
    fromGroupableValue :: GroupableValue -> a
and implement an instance for each type of value I’m going to get.
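A minimal sketch of how this sum-type approach might look, under stated assumptions: the helper wrap and the list methods are hypothetical names, m1 and m2 are given illustrative implementations (size and content), and fromGroupableValue is left out because this sketch only needs the wrapping direction.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Word (Word32)

-- One constructor per result type that a grouping function may return.
data GroupableValue
  = GroupString String
  | GroupInt Int
  | GroupWord32 Word32
  deriving (Eq, Show)

class GroupableValueClass a where
  toGroupableValue :: a -> GroupableValue

instance GroupableValueClass String where
  toGroupableValue = GroupString

instance GroupableValueClass Int where
  toGroupableValue = GroupInt

instance GroupableValueClass Word32 where
  toGroupableValue = GroupWord32

-- Hypothetical helper: erase the result type so that differently typed
-- functions can share one list.
wrap :: GroupableValueClass a => (String -> a) -> (String -> GroupableValue)
wrap f = toGroupableValue . f

-- Illustrative stand-ins for the "same size" and "same content" methods.
m1 :: String -> Int
m1 = length

m2 :: String -> String
m2 = id

-- All wrapped functions now have the same type.
methods :: [String -> GroupableValue]
methods = [wrap m1, wrap m2]
```

The cost of this design is that every new result type needs a new constructor and a new instance.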
Question: am I doing it right and, if not, is there a simpler way to solve this task?
Update:
Here’s the full minimal code that should describe what I want (simplified, with no IO, etc.):
data DuplicateGroup = DuplicateGroup
-- method for "same size" -- returns size
m1 :: String -> Int
m1 content = 10
-- method for "same content" -- returns content
m2 :: String -> String
m2 content = "sample content"
groupByMethods :: (Eq a) => [(String -> a)] -> [DuplicateGroup]
groupByMethods predicates = undefined
main :: IO ()
main = do
    let groups = (groupByMethods [m1, m2])
    return ()
Lists are always homogeneous, so you can’t put items with a different type a into the same list (as you noticed). There are several ways to design around this, but I usually prefer using GADTs. For example, you can wrap each function in a GADT, here called DuplicateTest.
This solution still needs a new type, but at least you don’t have to specify all cases in advance or create boilerplate type classes. Since the generic type a is essentially “hidden” inside the GADT, you can define a list that contains functions with different return types, each wrapped in the DuplicateTest GADT.
You can also solve this without using any language extensions or introducing new types by simply rethinking your functions. The main intention is to group files according to some property a, so getDuplicatesByMethods could instead take in functions that group files according to some criterion. Then we can define a helper function that turns each property function into such a grouping function, and call getDuplicatesByMethods with the results.
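Both designs above can be sketched in one small module. This is a minimal, pure sketch under stated assumptions, not the original answer's code: files are represented by their content as plain String values (as in the question's simplified update), the DuplicateReason constructors, the fields of DuplicateGroup, and the names groupOn, runTest, and groupByTests are hypothetical, and m1 and m2 are given illustrative implementations.

```haskell
{-# LANGUAGE GADTs #-}

import Data.List (partition)

-- Hypothetical constructors; the question does not list the real ones.
data DuplicateReason = SameSize | SameContent
  deriving (Eq, Show)

-- Assumed shape: a reason plus the files that share the property.
data DuplicateGroup = DuplicateGroup DuplicateReason [String]
  deriving (Eq, Show)

-- Illustrative stand-ins for the question's m1 and m2.
m1 :: String -> Int
m1 = length

m2 :: String -> String
m2 = id

-- Helper: group items whose keys are equal, using only Eq (quadratic).
groupOn :: Eq k => (a -> k) -> [a] -> [[a]]
groupOn _ [] = []
groupOn key (x : xs) = (x : same) : groupOn key rest
  where
    (same, rest) = partition (\y -> key y == key x) xs

-- Approach 1: a GADT hides the result type a but keeps its Eq instance.
data DuplicateTest where
  DuplicateTest :: Eq a => (String -> a) -> DuplicateReason -> DuplicateTest

-- Matching on the constructor brings the hidden Eq instance into scope.
runTest :: [String] -> DuplicateTest -> [DuplicateGroup]
runTest files (DuplicateTest key reason) =
  [DuplicateGroup reason grp | grp <- groupOn key files, length grp > 1]

groupByTests :: [DuplicateTest] -> [String] -> [DuplicateGroup]
groupByTests tests files = concatMap (runTest files) tests

-- Approach 2: no extension needed; pass grouping functions directly.
groupByMethods :: [([String] -> [[String]], DuplicateReason)] -> [String] -> [DuplicateGroup]
groupByMethods methods files =
  [ DuplicateGroup reason grp
  | (grouping, reason) <- methods
  , grp <- grouping files
  , length grp > 1
  ]
```

With files = ["aa", "bb", "aa", "c"], both groupByTests [DuplicateTest m1 SameSize, DuplicateTest m2 SameContent] files and groupByMethods [(groupOn m1, SameSize), (groupOn m2, SameContent)] files should yield one same-size group and one same-content group. In the second approach the type variable never appears in the list's element type, because groupOn has already consumed it.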