Is there a pattern for combining a parallel step with a thread-safe calculation on its results?
I need to calculate a result in which the first step would benefit from running in parallel, while the second step is a serial process over the results of the first.
One option is to run the parallel step, save the output to a collection, and then process the collection serially; I have that working. The problem there is memory management, as the collection can be very large.
Below is the serial version. Basically, I want to parallelize the calls to TableQueryGetRowKeys and use their results in a thread-safe manner. I tried simply making the for loop parallel and putting a lock around the final results, but rowKeys could be off. I also tried Aggregate, but I could not figure out how to pass a collection to the aggregate, let alone perform a thread-safe Intersect inside it.
    IEnumerable<string> finalResults = null;
    if (partitionKey.Length == 0) return finalResults;
    object lockObject = new object();
    finalResults = TableQueryGetRowKeys(partitionKey[0], 0);
    HashSet<string> rowKeys;
    for (int i = 1; i < partitionKey.Length; i++)
    {
        // IO operation to Azure Table Storage against the PartitionKey
        // so very amenable to parallel
        rowKeys = TableQueryGetRowKeys(partitionKey[i]);
        // a memory and CPU operation
        // this should be much faster than TableQueryGetRowKeys
        // going parallel and wrapping this in a lock did not properly synch rowKeys
        finalResults = finalResults.Intersect(rowKeys);
    }
    return finalResults;
Assuming that TableQueryGetRowKeys is thread safe, you can make each call in parallel with PLINQ and combine the results with Intersect.
In stepwise fashion this algorithm works like so:
1. partitionKey.AsParallel() turns the regular IEnumerable<string> into a ParallelQuery<string>, which allows parallel processing of the sequence.
2. ParallelEnumerable.Select is used to call TableQueryGetRowKeys in parallel.
3. The result of TableQueryGetRowKeys is then wrapped in a ParallelQuery<T> using AsParallel().
4. ParallelEnumerable.Intersect is used as an aggregation function over each “parallel-enabled” enumeration returned by TableQueryGetRowKeys.
In effect, the same approach could be used in serial to replace your previous code by removing the AsParallel calls.
You can “convince” yourself that this is equivalent to your method when you look at the meat and potatoes of your implementation: each pass of the loop assigns finalResults.Intersect(rowKeys) back to finalResults, which is a running aggregation with Intersect as the operator.
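The four steps above can be sketched as follows. This is only a sketch under stated assumptions: the use of Aggregate to apply Intersect pairwise is one reading of step 4, the class and method names (RowKeyIntersection, IntersectParallel, IntersectSerial) are hypothetical, and TableQueryGetRowKeys is replaced by an in-memory stand-in so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowKeyIntersection
{
    // Hypothetical sample data standing in for Azure Table Storage.
    private static readonly Dictionary<string, string[]> FakeTable =
        new Dictionary<string, string[]>
        {
            ["p1"] = new[] { "a", "b", "c", "d" },
            ["p2"] = new[] { "b", "c", "d", "e" },
            ["p3"] = new[] { "c", "d", "f" },
        };

    // Stand-in for the real query; assumed to be thread safe
    // (here it only reads from a dictionary that is never modified).
    public static HashSet<string> TableQueryGetRowKeys(string partitionKey)
        => new HashSet<string>(FakeTable[partitionKey]);

    // Parallel version: the queries run concurrently and the per-partition
    // results are combined pairwise with ParallelEnumerable.Intersect.
    public static IEnumerable<string> IntersectParallel(string[] partitionKey)
    {
        // Mirrors the early return in the question; Aggregate without a seed
        // throws on an empty sequence.
        if (partitionKey.Length == 0) return null;

        return partitionKey.AsParallel()
            .Select(key => TableQueryGetRowKeys(key).AsParallel())
            .Aggregate((left, right) => left.Intersect(right));
    }

    // Serial version: the same query with the AsParallel calls removed.
    // The cast lets Aggregate work on IEnumerable<string>, which is what
    // Enumerable.Intersect returns.
    public static IEnumerable<string> IntersectSerial(string[] partitionKey)
    {
        if (partitionKey.Length == 0) return null;

        return partitionKey
            .Select(key => (IEnumerable<string>)TableQueryGetRowKeys(key))
            .Aggregate((left, right) => left.Intersect(right));
    }
}
```

With the sample data, calling either method with new[] { "p1", "p2", "p3" } yields the keys c and d; the parallel version does not guarantee their order.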
Rewriting the loop in your implementation using + instead of Intersect turns it into a running sum of the form result = result + value.
Now it becomes clear that Intersect could be used in place of other more “common” aggregation functions (e.g. mathematical operators).
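To make the analogy concrete, here is a minimal sketch that writes the same fold three ways: as a loop with +, as Aggregate with +, and as Aggregate with Intersect. The class and method names are hypothetical and are not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregationAnalogy
{
    // The shape of the loop in the question, with + in place of Intersect.
    // Assumes values is non-empty, as the question's loop does after its guard.
    public static int SumWithLoop(int[] values)
    {
        int result = values[0];
        for (int i = 1; i < values.Length; i++)
        {
            result = result + values[i];
        }
        return result;
    }

    // The same running sum expressed with Aggregate.
    public static int SumWithAggregate(int[] values)
        => values.Aggregate((left, right) => left + right);

    // Swapping + for Intersect gives the set version of the same fold.
    public static IEnumerable<string> IntersectWithAggregate(
        IEnumerable<IEnumerable<string>> sets)
        => sets.Aggregate((left, right) => left.Intersect(right));
}
```

Both + and set intersection give the same result regardless of how the operands are grouped or ordered, which is what should make it safe to let PLINQ combine partial results in whatever order they arrive.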