I have a business case in which, for each record in a given source table, a number of changes need to be made in different tables, and each of these sourceTable records needs to be processed in isolation.
So I have the following pseudocode:
MyEntityFrameworkContext ctx = new MyEntityFrameworkContext();
foreach (sourceRecord sr in ctx.sourceTable)
{
    try
    {
        // a separate context per sourceRecord, so each record is processed in isolation
        using (MyEntityFrameworkContext tctx = new MyEntityFrameworkContext())
        {
            string result1 = MakeUpdatesToSomeOtherTable1(tctx);
            sr.Result1 = result1;
            string result2 = MakeUpdatesToSomeOtherTable2(tctx);
            sr.Result2 = result2;
            // will be more tables here.
            using (TransactionScope ts = new TransactionScope())
            {
                tctx.SaveChanges(); // to save changes made to OtherTable1 and OtherTable2
                tctx.ExecuteStoreCommand("SQL that makes a few other changes related to sourceRecord to tables that are NOT in the EF context");
                ts.Complete();
            }
        }
    }
    catch (Exception ex)
    {
        sr.ExceptionResult = ex.Message;
    }
}
ctx.SaveChanges(); // to save all changes made to sourceTable.
The reason for the tctx and TransactionScope within the loop is that, for each sourceRecord being processed, I need the changes to OtherTable1 and OtherTable2 to be saved in the same transaction as the call to tctx.ExecuteStoreCommand().
Note also that I need the results written to sourceTable to be saved independently of the changes made to the tables updated in the TransactionScope. Hence I can’t include the update of sourceTable in the same TransactionScope, since if that transaction rolls back, I won’t have a record of the exception. This way, at the end of the whole process, I can see which sourceRecords failed and which ones succeeded.
The above pseudocode works perfectly.
However, I would like to take advantage of parallelism here, and have converted the foreach to a Parallel.ForEach(). But then I run into very unexpected errors: TransactionAbortedException after calling ts.Complete(), NullReferenceException when calling ctx.SaveChanges(), and, when setting one of the Result properties of sourceRecord, sometimes InvalidOperationException: EntityMemberChanged or EntityComplexMemberChanged was called without first calling EntityMemberChanging or EntityComplexMemberChanging on the same change tracker with the same property name.
So I’m thinking that parallelism, while being perfect for QUERIES, does not lend itself well to data UPDATES in Entity Framework? What am I missing, or not understanding about parallelism? I don’t understand why my above approach breaks when converted to use parallelism. Any advice would be appreciated.
The EF object context isn’t thread safe, so there is a high potential for catastrophic errors when multiple threads are buzzing around in the same context.
It looks like you have at least one context object outside the foreach loop and shared among threads.
Based on your description I’d guess that updating properties on the sourceRecord entities from multiple threads is corrupting some internal state in the context – probably the collections of data it maintains for change tracking.
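One way to keep the worker threads away from the shared context is to have them write their results into a thread-safe collection, and only copy those results onto the sourceRecord entities after the parallel loop has finished. Here is a minimal sketch of that idea, not a drop-in fix. SourceRecord, Outcome, IsolatedProcessing and the processOne delegate are hypothetical names I made up for illustration; processOne stands in for the body of your loop (the new tctx, the MakeUpdates calls and the TransactionScope), and I am assuming each record has an integer key called Id.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a sourceTable entity tracked by the shared context.
public class SourceRecord
{
    public int Id { get; set; }
    public string Result1 { get; set; }
    public string ExceptionResult { get; set; }
}

// Plain result holder; no context tracks it, so worker threads may create it freely.
public sealed class Outcome
{
    public string Result1 { get; set; }
    public string Error { get; set; }
}

public static class IsolatedProcessing
{
    // processOne receives only the record key, never the tracked entity.
    public static void Run(IList<SourceRecord> records, Func<int, string> processOne)
    {
        // Read the keys on the calling thread, before any parallel work starts.
        List<int> ids = records.Select(r => r.Id).ToList();
        var outcomes = new ConcurrentDictionary<int, Outcome>();

        // Parallel phase: nothing here touches the shared context or its entities.
        Parallel.ForEach(ids, id =>
        {
            var outcome = new Outcome();
            try
            {
                outcome.Result1 = processOne(id);
            }
            catch (Exception ex)
            {
                outcome.Error = ex.Message;
            }
            outcomes[id] = outcome;
        });

        // Sequential phase: one thread copies the results onto the tracked entities.
        foreach (SourceRecord sr in records)
        {
            Outcome o = outcomes[sr.Id];
            sr.Result1 = o.Result1;
            sr.ExceptionResult = o.Error;
        }
        // With EF, the single ctx.SaveChanges() call would follow here.
    }
}
```

In your code, records would be something like ctx.sourceTable.ToList() materialized before the loop, and the final ctx.SaveChanges() stays on the original thread, so the change tracker of ctx is only ever touched by one thread.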