I’ve written some code to try and describe my concern:
    static void Main(string[] args)
    {
        IEnumerable<decimal> marks = GetClassMarks();
        IEnumerable<Person> students = GetStudents();
        students.AsParallel().ForAll(p => GenerateClassReport(p, marks));
        Console.ReadKey();
    }
GetClassMarks uses yield return to produce values from my weird data source. Assume that GenerateClassReport basically does marks.Sum() / marks.Count() to get the class average.
From what I understand, students.AsParallel().ForAll is a parallel foreach.
My worry is what is going to happen inside the GetClassMarks method.
- Is it going to be enumerated once or many times?
- What order is the enumeration going to happen in?
- Do I need to do a .ToList() on marks to make sure it is only hit once?
Assuming that GenerateClassReport() enumerates marks once, then marks will be enumerated once for each element in students.
Each thread will enumerate the collection in its default order, but several threads will do so concurrently. The concurrent enumeration order is generally unpredictable. Also, you should note that the number of threads is limited and variable, so most likely not all of the enumerations will occur concurrently.
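To make the "once per student" behaviour concrete, here is a minimal sketch that counts how often the iterator body starts running. The class name DeferredDemo, the counter, the hard-coded marks, and the use of strings in place of Person are all assumptions for illustration, not the asker's real code. GenerateClassReport here uses Average(), which enumerates marks once; the marks.Sum() / marks.Count() version from the question would enumerate it twice per student.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class DeferredDemo
{
    // Counts how many times the iterator body starts running (hypothetical instrumentation).
    private static int _iteratorRuns;

    // Hypothetical stand-in for the question's GetClassMarks().
    public static IEnumerable<decimal> GetClassMarks()
    {
        // Runs on the first MoveNext() of each enumeration, possibly on several threads at once.
        Interlocked.Increment(ref _iteratorRuns);
        yield return 60m;
        yield return 70m;
        yield return 80m;
    }

    // Enumerates marks exactly once via Average().
    public static decimal GenerateClassReport(string student, IEnumerable<decimal> marks)
    {
        return marks.Average();
    }

    public static int CountIteratorRuns(int studentCount)
    {
        _iteratorRuns = 0;
        IEnumerable<decimal> marks = GetClassMarks(); // deferred: nothing has run yet
        IEnumerable<string> students =
            Enumerable.Range(1, studentCount).Select(i => "Student " + i);

        // ForAll blocks until every student has been processed.
        students.AsParallel().ForAll(s => GenerateClassReport(s, marks));
        return _iteratorRuns;
    }

    public static void Main()
    {
        // One iterator run per student, regardless of how many threads were used.
        Console.WriteLine(CountIteratorRuns(5));
    }
}
```

The count depends only on the number of students and on how many times each report enumerates marks, not on how many threads PLINQ happens to use.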
If GetClassMarks() is an iterator (i.e. it uses the yield construct), then its execution will be deferred and its body will run once for each time marks is enumerated (i.e. once for each element in students).
If you use IEnumerable<decimal> marks = GetClassMarks().ToList(), or if GetClassMarks() internally returns a concrete list or array, then GetClassMarks() will be executed immediately and the results will be stored and enumerated in each of the parallel threads without calling GetClassMarks() again.
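The ToList() variant can be sketched the same way. Again, the class name MaterializedDemo, the counter, and the sample marks are assumptions made up for this example. The iterator body runs once, when the list is built, and the parallel work then only reads the stored list. Concurrent reads of a List<T> are safe as long as nothing modifies it at the same time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class MaterializedDemo
{
    // Counts how many times the iterator body starts running (hypothetical instrumentation).
    private static int _iteratorRuns;

    // Hypothetical stand-in for the question's GetClassMarks().
    public static IEnumerable<decimal> GetClassMarks()
    {
        Interlocked.Increment(ref _iteratorRuns);
        yield return 60m;
        yield return 70m;
        yield return 80m;
    }

    // Enumerates the already materialized marks; the iterator is not involved any more.
    public static decimal GenerateClassReport(int student, IEnumerable<decimal> marks)
    {
        return marks.Average();
    }

    public static int CountIteratorRuns(int studentCount)
    {
        _iteratorRuns = 0;

        // The iterator runs to completion here, exactly once.
        List<decimal> marks = GetClassMarks().ToList();

        // Each parallel report reads the stored list only.
        Enumerable.Range(1, studentCount)
                  .AsParallel()
                  .ForAll(i => GenerateClassReport(i, marks));
        return _iteratorRuns;
    }

    public static void Main()
    {
        // The iterator body ran once, no matter how many students there are.
        Console.WriteLine(CountIteratorRuns(5));
    }
}
```

Materializing first is usually the safer choice when the data source is slow, has side effects, or is not safe to read from several threads at once.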