This is my input data:
[[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]]
I would like to map this into the following:
{:a [[1 2] [3 4] [5 6]] :b [[\a \b] [\c \d] [\e \f]]}
This is what I have so far:
(defn- build-annotation-map [annotation & m]
  (let [gff (first annotation)
        remaining (rest annotation)
        seqname (first gff)
        current {seqname [(nth gff 3) (nth gff 4)]}]
    (if (not (seq remaining))
      m
      (let [new-m (merge-maps current m)]
        (apply build-annotation-map remaining new-m)))))
(defn- merge-maps [m & ms]
  (apply merge-with conj
         (when (first ms)
           (reduce conj ;this is to avoid [1 2 [3 4 ... etc.
                   (map (fn [k] {k []}) (keys m))))
         m ms))
The above produces:
{:a [[1 2] [[3 4] [5 6]]] :b [[\a \b] [[\c \d] [\e \f]]]}
It seems clear to me that the problem is in merge-maps, specifically with the function passed to merge-with (conj), but after banging my head for a while now, I’m about ready for someone to help me out.
I’m new to Lisp in general, and Clojure in particular, so I’d also appreciate comments that go beyond the specific problem: style, brain-dead constructs on my part, etc. Thanks!
Solution (close enough, anyway):
(group-by first [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
=> {:a [[:a 1 2] [:a 3 4] [:a 5 6]], :b [[:b \a \b] [:b \c \d] [:b \e \f]]}
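If the leading keyword has to be dropped from each row to match the output asked for in the question, the following is one possible sketch. The names data, strip-keys, and group-rows are made up for illustration, and update assumes Clojure 1.7 or later (update-in with a one-element path would work on older versions).

```clojure
(def data [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])

;; Option 1: post-process the result of group-by,
;; removing the first element (the key) from every grouped row.
(defn strip-keys [rows]
  (into {}
        (for [[k grouped] (group-by first rows)]
          [k (mapv (comp vec rest) grouped)])))

;; Option 2: build the map in a single pass with reduce.
;; (fnil conj []) starts a fresh vector the first time a key is seen.
(defn group-rows [rows]
  (reduce (fn [m [k & vs]]
            (update m k (fnil conj []) (vec vs)))
          {}
          rows))

(strip-keys data)
;; => {:a [[1 2] [3 4] [5 6]], :b [[\a \b] [\c \d] [\e \f]]}
(group-rows data)
;; => {:a [[1 2] [3 4] [5 6]], :b [[\a \b] [\c \d] [\e \f]]}
```

Both versions avoid the nesting seen with merge-with conj, because each row is conj'ed onto its key's vector as a single element.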
Concerning your code, the most significant problem is naming. Firstly, without first understanding your code, I wouldn’t have any idea what is meant by annotation, gff, and seqname. current is pretty ambiguous too. In Clojure, remaining would generally be called more, depending on the context and on whether a more specific name should be used.
Within your let statement,
gff (first annotation)
remaining (rest annotation)
I’d probably take advantage of destructuring, like this:
(let [[first & more] annotation] ...)
If you would rather use (rest annotation), then I’d suggest using next instead, as it returns nil when there are no more items, which allows you to write (if-not remaining ...) rather than (if-not (seq remaining) ...).
In Clojure, unlike some other Lisps (Common Lisp, for example), the empty list is truthy.
This article shows the standard for idiomatic naming.
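To make the rest versus next advice above concrete, here is a small sketch. The helper last-item is hypothetical and exists only to show destructuring combined with a nil check; it is not part of the original code.

```clojure
;; rest always returns a (possibly empty) sequence; next returns nil instead.
(rest [1])                          ;; => ()
(next [1])                          ;; => nil

;; An empty list is truthy, so only the next version can be tested directly.
(if (rest [1]) :truthy :falsey)     ;; => :truthy
(if (next [1]) :truthy :falsey)     ;; => :falsey

;; Hypothetical helper: returns the last item of a collection.
;; Destructuring with & binds more to nil once nothing is left,
;; so no (seq ...) call is needed in the test.
(defn last-item [coll]
  (let [[head & more] coll]
    (if-not more
      head
      (recur more))))

(last-item [1 2 3])                 ;; => 3
```

The name head is used here instead of first so that the core function first is not shadowed inside the let.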