I am trying to parallelize a query with a groupby statement in it. The query is similar to
var colletionByWeek = (
from item in objectCollection
group item by item.WeekStartDate into weekGroups
select weekGroups
).ToList();
If I use Parallel.ForEach with shared variable like below, it works fine. But I don’t want to use shared variables in parallel query.
var pSummary=new List<object>();
Parallel.ForEach(colletionByWeek, week =>
{
pSummary.Add(new object()
{
p1 = week.First().someprop,
p2= week.key,
.....
});
}
);
So, I have changed the above parallel statement to use local variables. But the compiler complains about the source type <IEnumerable<IGrouping<DateTime, object>> can not be converted into System.Collections.Concurrent.OrderablePartitioner<IEnumerable<IGrouping<DateTime, object>>.
Am I giving a wrong source type? or is this type IGouping type handled differently? Any help would be appreciated. Thanks!
Parallel.ForEach<IEnumerable<IGrouping<DateTime, object>>, IEnumerable<object>>
(spotColletionByWeek,
() => new List<object>(),
(week, loop, summary) =>
{
summary.Add(new object()
{
p1 = week.First().someprop,
p2= week.key,
.....
});
return new List<object>();
},
(finalResult) => pSummary.AddRange(finalResult)
);
The type parameter
TSourceis the element type, not the collection type. And the second type parameter represents the local storage type, so it should beList<T>, if you want toAdd()to it. This should work:That’s assuming you don’t actually have
objects there, but some specific type.Although explicit type parameters are not even necessary here. The compiler should be able to infer them.
But there are other problems in the code:
you shouldn’t return new
Listfrom the main delegate, butsummarythe delegate that processes
finalResultmight be executed concurrently on multiple threads, so you should use locks or a concurrent collection there.