Taking an example of the Fibonacci series from the Clojure wiki, the Clojure code is:
(def fib-seq
  (lazy-cat [0 1] (map + (rest fib-seq) fib-seq)))
If you were to think about this starting from the [0 1], how does it work? It would be great to have suggestions on the thought process that goes into thinking in these terms.
As you noted, the [0 1] establishes the base cases: the first two values in the sequence are zero, then one. After that, each value is the sum of the previous value and the value before that one. Hence, we can’t even compute the third value in the sequence without having at least two that come before it. That’s why we need two values with which to start off.

Now look at the map form. It says to take the head items from two different sequences, combine them with the + function (adding multiple values to produce one sum), and expose the result as the next value in a sequence. The map form is zipping together two sequences, presumably of equal length, into one sequence of the same length.

The two sequences fed to map are different views of the same basic sequence, shifted by one element. The first sequence is “all but the first value of the base sequence”. The second sequence is the base sequence itself, which, of course, includes the first value. But what should the base sequence be?

The definition above said that each new element is the sum of the previous (Z – 1) and the predecessor to the previous element (Z – 2). That means that extending the sequence of values requires access to the previously computed values in the same sequence. We could use a two-element shift register, but we can also request access to our previous results instead. That’s what the recursive reference to the sequence called fib-seq does here. The symbol fib-seq refers to a sequence that’s a concatenation of zero, one, and then the sums of its own Z – 2 and Z – 1 values.

Taking the sequence called
fib-seq, drawing the first item yields the first element of the [0 1] vector: zero. Drawing the second item yields the second element of the vector: one. Upon drawing the third item, we consult the map to generate a sequence and use that as the remaining values. The sequence generated by map here starts out with the sum of the first item of “the rest of” [0 1], which is one, and the first item of [0 1], which is zero. That sum is one.

Drawing the fourth item consults map again, which now must compute the sum of the second item of “the rest of” the base sequence, which is the one generated by map, and the second item of the base sequence, which is the one from the vector [0 1]. That sum is two.

Drawing the fifth item consults map, summing the third item of “the rest of” the base sequence, which we just found to be two, and the third item of the base sequence, which is the one resulting from summing zero and one. That sum is three.

You can see how this is building up to match the intended definition for the series. What’s harder to see is whether drawing each item is recomputing all the preceding values twice, once for each sequence examined by map. It turns out there’s no such repetition here.

To confirm this, augment the definition of fib-seq to instrument the use of function +, then ask for the first ten items.
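A minimal sketch of such an instrumented definition might look like the following. The anonymous function wrapping +, the printed message, and the add-calls counter are assumptions made for illustration, not necessarily the original instrumentation.

```clojure
;; Illustrative counter (an assumption): tallies how often + is invoked.
(def add-calls (atom 0))

(def fib-seq
  (lazy-cat [0 1]
            (map (fn [a b]
                   ;; Report and count each addition before performing it.
                   (swap! add-calls inc)
                   (println (format "Adding %d and %d." a b))
                   (+ a b))
                 (rest fib-seq) fib-seq)))

;; Force the first ten items so that every required addition runs.
(doall (take 10 fib-seq))
;; => (0 1 1 2 3 5 8 13 21 34)

@add-calls
;; => 8
```

Asking for the same ten items a second time prints nothing further, since the values are already cached in the sequence.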
Notice that there are eight calls to + to generate the first ten values.

Since writing the preceding discussion, I’ve spent some time studying the implementation of lazy sequences in Clojure (in particular, the file LazySeq.java) and thought this would be a good place to share a few observations.
First, note that many of the lazy sequence processing functions in Clojure eventually use lazy-seq over some other collection. lazy-seq creates an instance of the Java type LazySeq, which models a small state machine. It has several constructors that allow it to start in different states, but the most interesting case is the one that starts with just a reference to a nullary function. Constructed that way, the LazySeq has neither evaluated the function nor found a delegate sequence (type ISeq in Java). The first time one asks the LazySeq for its first element (via first) or any successors (via next or rest), it evaluates the function, digs down through the resulting object to peel away any wrapping layers of other LazySeq instances, and finally feeds the innermost object through the Java function RT#seq(), which results in an ISeq instance.

At this point, the LazySeq has an ISeq to which to delegate calls on behalf of first, next, and rest. Usually the “head” ISeq will be of type Cons, which stores a constant value in its “first” (or “car”) slot and another ISeq in its “rest” (or “cdr”) slot. That ISeq in its “rest” slot can in turn be a LazySeq, in which case accessing it will again require this same evaluation of a function, peeling away any lazy wrappers on the return value, and passing that value through RT#seq() to yield another ISeq to which to delegate.

The LazySeq instances remain linked together, but having forced one (through first, next, or rest) causes it to delegate straight through to some non-lazy ISeq thereafter. Usually that forcing evaluates a function that yields a Cons with its “first” slot bound to a value and its tail bound to another LazySeq; it’s a chain of generator functions that each yield one value (the Cons’s “first” slot) linked to another opportunity to yield more values (a LazySeq in the Cons’s “rest” slot).

Tying this back, in the Fibonacci sequence example above, map will take each of the nested references to fib-seq and walk them separately via repeated calls to rest. Each such call will transform at most one LazySeq holding an unevaluated function into a LazySeq pointing to something like a Cons. Once transformed, any subsequent accesses will quickly resolve to the Conses, where the actual values are stored. When one branch of the map zipping walks fib-seq one element behind the other, the values have already been resolved and are available for constant-time access, with no further evaluation of the generator function required.

Here are some diagrams to help visualize this interpretation of the code:
As map progresses, it hops from LazySeq to LazySeq (and hence Cons to Cons), and the rightmost edge only expands the first time one calls first, next, or rest on a given LazySeq.
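The force-once behavior described above can be observed from the REPL. The following is a small sketch rather than anything taken from the Clojure sources: counted-from and the evaluations counter are hypothetical names introduced for illustration.

```clojure
;; Hypothetical counter: how many lazy-seq bodies have been evaluated.
(def evaluations (atom 0))

(defn counted-from
  "Lazy sequence n, n+1, ... that bumps `evaluations` whenever a body runs."
  [n]
  (lazy-seq
    (swap! evaluations inc)
    ;; A Cons whose tail is another, still unevaluated, LazySeq.
    (cons n (counted-from (inc n)))))

(def s (counted-from 0))

(realized? s)         ; false: the generator function has not run yet
(first s)             ; forces only the first LazySeq
(realized? s)         ; true
(realized? (rest s))  ; false: the tail is still an unevaluated LazySeq

(doall (take 3 s))    ; walks three links of the chain
(doall (take 3 s))    ; the second walk reuses the cached Cons cells
@evaluations          ; 3
```

Each generator function runs at most once, no matter how many readers walk the chain, which is why the two views of fib-seq consumed by map do not duplicate any work.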