I need help understanding how a function works: it is a recursive function that uses yield return, but I can’t figure out how it works. It is used to calculate an approximate cumulative density function over a set of data.
Thanks a lot to everyone.
/// <summary>
/// Approximates the cumulative density through a recursive procedure
/// estimating counts of regions at different resolutions.
/// </summary>
/// <param name="data">Source collection of integer values</param>
/// <param name="maximum">The largest integer in the resulting cdf (it has to be a power of 2...</param>
/// <returns>A list of counts, where entry i is the number of records less than i</returns>
public static IEnumerable<int> FUNCT(IEnumerable<int> data, int max)
{
    if (max == 1)
    {
        yield return data.Count();
    }
    else
    {
        var t = data.Where(x => x < max / 2);
        var f = data.Where(x => x > max / 2);
        foreach (var value in FUNCT(t, max / 2))
            yield return value;
        var count = t.Count();
        f = f.Select(x => x - max / 2);
        foreach (var value in FUNCT(f, max / 2))
            yield return value + count;
    }
}
In essence, iterator methods (methods that return IEnumerable and use yield return) behave slightly differently from traditional recursive functions. As a base case, suppose you have:
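A minimal sketch of such a function follows. It is an illustrative reconstruction, not necessarily the snippet originally posted: the class name Sketch, the body of F, and the offset 5 are assumptions, chosen only so that F has the same shape as FUNCT and F(2) prints 1267.

```csharp
using System;
using System.Collections.Generic;

public static class Sketch
{
    // Hypothetical F with the same shape as FUNCT: a base case that yields
    // directly, and a recursive case with two loops that re-yield values.
    public static IEnumerable<int> F(int n)
    {
        if (n == 1)
        {
            yield return 1;
            yield return 2;
        }
        else
        {
            // First loop: pass the inner values through unchanged.
            foreach (var value in F(n - 1))
                yield return value;
            // Second loop: run the inner iterator again and offset each value.
            foreach (var value in F(n - 1))
                yield return value + 5;
        }
    }

    public static void Main()
    {
        // Each pass of this loop resumes F just after its last yield return.
        foreach (var value in F(2))
            Console.Write(value);   // writes 1, 2, 6, 7 with no separators
        Console.WriteLine();
    }
}
```

With this assumed F, the first inner loop produces 1 and 2, and the second produces 1 + 5 and 2 + 5.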
F takes the basic form of the original FUNCT. If we call F(2), then walking through the yields, 1267 is printed. Note that the yield return statement yields control to the caller, but the next iteration causes the function to continue where it had previously yielded.

The CDF method adds some additional complexity, but not much. The recursion splits the collection into two pieces and computes the CDF of each piece, until max=1. Then the function counts the number of elements and yields it, with each yield propagating recursively to the enclosing loop.
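The propagation through enclosing loops can be made visible with a small trace. This is only a sketch: PropagationTrace, Level, Run, and the value 42 are hypothetical names and data introduced for illustration. It logs each hand-off as a single value travels from the innermost iterator up to the caller, and also shows that no iterator code runs until the sequence is enumerated.

```csharp
using System;
using System.Collections.Generic;

public static class PropagationTrace
{
    // Hypothetical nested iterator: depth 0 yields a value, and every outer
    // level re-yields whatever its inner level produced.
    public static IEnumerable<int> Level(int depth, List<string> log)
    {
        if (depth == 0)
        {
            log.Add("depth 0 yields 42");
            yield return 42;
        }
        else
        {
            foreach (var value in Level(depth - 1, log))
            {
                log.Add("depth " + depth + " passes " + value + " up");
                yield return value;
            }
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        // Calling Level only creates the iterator; its body has not run yet.
        var sequence = Level(2, log);
        log.Add("entries logged before enumeration: " + log.Count);
        foreach (var value in sequence)
            log.Add("caller receives " + value);
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(Environment.NewLine, Run()));
    }
}
```

The log records depth 0 first, then depth 1, then depth 2, and only then the caller, which is the same path each count takes in FUNCT.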
To walk through FUNCT, suppose you run with data=[0,1,0,1,2,3,2,1] and max=4. Then running through the method, using the same Main function above as a driver, returns the values (2,2,5,5). The strict > comparison drops every value equal to max / 2 at each level, which is why the counts repeat. (Using >= instead of > in the filter for f would yield the values (2,5,7,8); note that these are the exact values of a scaled CDF for non-negative integral data, rather than an approximation.)
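Both results can be checked with a short program. This is a sketch under stated assumptions: CdfWalkthrough, Cdf, and the inclusive flag are hypothetical additions used only to switch between the two comparisons, and each half is materialized with ToList so it is filtered once, which the original FUNCT does not do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CdfWalkthrough
{
    // Same recursion as FUNCT, plus one hypothetical switch:
    // inclusive == false keeps the original x > max / 2 filter,
    // inclusive == true uses x >= max / 2 instead.
    public static IEnumerable<int> Cdf(IEnumerable<int> data, int max, bool inclusive)
    {
        if (max == 1)
        {
            yield return data.Count();
            yield break;
        }

        int half = max / 2;
        // Lower half: values below the midpoint.
        var lower = data.Where(x => x < half).ToList();
        // Upper half: shifted down so the recursive call sees values from 0.
        var upper = data.Where(x => inclusive ? x >= half : x > half)
                        .Select(x => x - half)
                        .ToList();

        foreach (var value in Cdf(lower, half, inclusive))
            yield return value;
        // Everything in the lower half counts toward each upper entry.
        foreach (var value in Cdf(upper, half, inclusive))
            yield return value + lower.Count;
    }

    public static void Main()
    {
        var data = new[] { 0, 1, 0, 1, 2, 3, 2, 1 };
        Console.WriteLine(string.Join(",", Cdf(data, 4, false))); // 2,2,5,5
        Console.WriteLine(string.Join(",", Cdf(data, 4, true)));  // 2,5,7,8
    }
}
```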