Let’s assume I have a series of functions that work on a sequence, and I want to use them together in the following fashion:
let meanAndStandardDeviation data =
    let m = mean data
    let sd = standardDeviation data
    (m, sd)
The code above is going to enumerate the sequence twice. I am interested in a function that will give the same result but enumerate the sequence only once. This function will be something like this:
magicFunction (mean, standardDeviation) data
where the input is a tuple of functions and a sequence, and the output is the same as that of the function above.
Is this possible if the functions mean and standardDeviation are black boxes and I cannot change their implementation?
If I wrote mean and standardDeviation myself, is there a way to make them work together? Maybe somehow making them keep yielding the input to the next function and hand over the result when they are done?
The only way to do this using just a single iteration when the functions are black boxes is to use the Seq.cache function (which evaluates the sequence once and stores the results in memory) or to convert the sequence to another in-memory representation.
When a function takes seq<T> as an argument, you don’t even have a guarantee that it will evaluate it just once – and usual implementations of standard deviation would first calculate the average and then iterate over the sequence again to calculate the squares of errors.
I’m not sure if you can calculate standard deviation with just a single pass. However, it is possible to do that if the functions are expressed using fold. For example, calculating maximum and average using two passes looks like this:
You can do that using a single pass like this:
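A minimal sketch of both versions, assuming a small integer list as sample data. The names data, maxTwoPass, avgTwoPass, maxV, sum, count and avgOnePass are illustrative only:

```fsharp
open System

// Sample input (an assumption for illustration).
let data = [ 3; 9; 4; 8 ]

// Separate passes: each fold, and Seq.length, enumerates the sequence again.
let maxTwoPass = Seq.fold max Int32.MinValue data
let avgTwoPass = float (Seq.fold (+) 0 data) / float (Seq.length data)

// Single pass: the fold state is a tuple (maximum so far, sum so far, count so far).
let (maxV, sum, count) =
    data
    |> Seq.fold (fun (m, s, n) v -> (max m v, s + v, n + 1)) (Int32.MinValue, 0, 0)

// Post-processing step: turn the accumulated sum and count into the average.
let avgOnePass = float sum / float count
```

Both versions give the same maximum and average; only the second touches each element exactly once.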
The lambda function is a bit ugly, but you can define a combinator to compose two functions:
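One possible shape for such a combinator, as a sketch. The name par and the sample data are assumptions, not something defined by the F# library:

```fsharp
open System

// Hypothetical combinator: takes two fold step functions and returns a single
// step function that updates a pair of states with the same input value.
let par f g (s1, s2) v = (f s1 v, g s2 v)

// Sample input (an assumption for illustration).
let data = [ 3; 9; 4; 8 ]

// Maximum and sum computed together in one enumeration of data.
let (maxV, sum) = Seq.fold (par max (+)) (Int32.MinValue, 0) data
```

The initial state is simply the pair of the two individual initial values, so more folds can be combined by nesting par.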
This approach works for functions that can be defined using fold, which means that they consist of some initial value (Int32.MinValue in the first example) and then some function that is used to update the initial (previous) state when it gets the next value (and then possibly some post-processing of the result). In general, it should be possible to rewrite single-pass functions in this style, but I’m not sure if this can be done for standard deviation. It can definitely be done for mean:
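A sketch of mean written as a fold, assuming float input. For what it is worth, standard deviation also fits this pattern if the state carries a running mean and a running sum of squared deviations (Welford's update). The function names below are illustrative, and the result is the population standard deviation, which may differ from what a given black-box implementation returns:

```fsharp
// Mean as a fold: the state is (count, sum); post-processing divides them.
// An empty sequence yields NaN here because of the 0.0 / 0.0 division.
let mean (data: seq<float>) =
    let (n, total) = Seq.fold (fun (n, total) v -> (n + 1, total + v)) (0, 0.0) data
    total / float n

// Welford's update step: the state is
// (count, running mean, sum of squared deviations from the running mean).
let welfordStep (n, m, m2) (x: float) =
    let n' = n + 1
    let delta = x - m
    let m' = m + delta / float n'
    (n', m', m2 + delta * (x - m'))

// Mean and population standard deviation from a single enumeration.
let meanAndStandardDeviation (data: seq<float>) =
    let (n, m, m2) = Seq.fold welfordStep (0, 0.0, 0.0) data
    (m, sqrt (m2 / float n))
```

For example, meanAndStandardDeviation [ 2.0; 4.0; 4.0; 4.0; 5.0; 5.0; 7.0; 9.0 ] gives a mean of 5 and a standard deviation of 2, up to floating-point rounding.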