I’ve got a process which returns a lot of data (on the order of 50-100k entities when committed).
I’m performing all the Adds before I Commit (Unit of Work pattern). I’m not too fussed about how long the commit actually takes: the process is very long-running (weeks), so a minute here or there isn’t the end of the world. But during the commit, a second application using the same abstract entities and context is unable to read from the database table and eventually times out.
If I wait until the commit has happened and attempt to read again from app #2, it works near-instantly.
So, how can I tell EF not to lock the table (presumably that’s what it’s doing) during the commit?
The key code is here:
Dim DbTask = TaskRepository.Single(Function(x) x.Id = CurrentTaskId)
' Task is a class which stores the results from the process before committing them
For Each Result In Task.Results
    Dim DbResult = TaskResultRepository.CreateInstance
    DbResult.Field1 = Result.Field1
    DbResult.Field2 = Result.Field2
    DbTask.Results.Add(DbResult)
Next
DbTask.JobStatus = Entities.JobStatuses.Completed
QLog.DebugFormat("Committing Task {1}: {0}", Task.Name, Task.Id)
UnitOfWork.Commit()
Tasks.Remove(Task)
QLog.InfoFormat("End Task {1}: {0}", Task.Name, Task.Id)
To make matters worse, I’m currently tool-deficient in that someone has misplaced the SQL Server install disk so no Management Studio/Performance Analyser, just Server Explorer.
For reference, my Repository(Of T As Entitybase).CreateInstance Method is:
Protected ReadOnly Entities As IDbSet(Of T)
Public Function CreateInstance() As T Implements Interfaces.IRepository(Of T).CreateInstance
    Dim Entity = Entities.Create(Of T)()
    With Entity
        .CreatedOn = Now
    End With
    Entities.Add(Entity)
    Return Entity
End Function
The problem is that you are importing everything in a single UoW, and the process takes weeks to complete(?). I really hope that is a typo and you don’t have a transaction open for weeks. A minute is too long; a week is absurd.
What you describe is ETL: Extract, Transform, Load, where you load data into the DB in bulk. You are using EF for this, which means you are using an ORM, and ORMs are optimized for small units of work. They are not meant as a tool for ETL.
The first problem is using an ORM to perform ETL. A better choice is Rhino.ETL or SSIS to manage the data import.
The second problem is the amount of data you are importing within a single transaction. Break this into chunks, maybe 1K to 5K records at a time. This will help throughput and can actually reduce the total time it takes to import all the data.
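As a rough illustration of the chunking idea, here is a minimal sketch. SplitIntoBatches is a hypothetical helper (it is not part of EF), the integer list stands in for Task.Results, and the batch size of 5000 is only an assumed starting point to tune.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module BatchCommitSketch

    ' Hypothetical helper (not part of EF): splits a list into consecutive
    ' batches of at most batchSize items.
    Function SplitIntoBatches(Of T)(source As List(Of T), batchSize As Integer) As List(Of List(Of T))
        If batchSize <= 0 Then Throw New ArgumentOutOfRangeException("batchSize")
        Dim batches As New List(Of List(Of T))()
        For start As Integer = 0 To source.Count - 1 Step batchSize
            Dim size As Integer = Math.Min(batchSize, source.Count - start)
            batches.Add(source.GetRange(start, size))
        Next
        Return batches
    End Function

    Sub Main()
        ' Stand-in data; in the question this would be Task.Results.
        Dim results As List(Of Integer) = Enumerable.Range(1, 12000).ToList()

        For Each batch In SplitIntoBatches(results, 5000)
            ' Real code would add each item to DbTask.Results here and then
            ' call UnitOfWork.Commit(), so each commit covers one batch only.
            Console.WriteLine("Would commit {0} rows", batch.Count)
        Next
    End Sub

End Module
```

With this shape, DbTask.JobStatus would presumably be set to Completed only alongside the final batch, so a reader never sees a task marked complete with only part of its results.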
The last thing to adjust: you will want to control the transaction isolation level manually. It sounds like Serializable isolation is being used, which is the most restrictive and slowest level; other work touching the same data cannot proceed until the transaction completes. You might find that ReadCommitted is a better isolation level, allowing reads to occur while data is written from another process.
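One way to set the isolation level explicitly is with System.Transactions. The sketch below assumes that UnitOfWork.Commit() ends up calling SaveChanges on an EF context that enlists in the ambient transaction. CommitReadCommitted is a hypothetical wrapper name, and the 5 minute timeout is an assumed value. It is worth knowing that a TransactionScope created without options defaults to Serializable.

```vb
Imports System
Imports System.Transactions

Module ReadCommittedCommitSketch

    ' Hypothetical wrapper: runs the supplied commit action inside a
    ' TransactionScope that uses ReadCommitted instead of the scope's
    ' default of Serializable.
    Sub CommitReadCommitted(commit As Action)
        Dim options As New TransactionOptions()
        options.IsolationLevel = IsolationLevel.ReadCommitted
        options.Timeout = TimeSpan.FromMinutes(5) ' assumed value, tune as needed

        Using scope As New TransactionScope(TransactionScopeOption.Required, options)
            commit()          ' e.g. UnitOfWork.Commit()
            scope.Complete()  ' without this the transaction rolls back on Dispose
        End Using
    End Sub

End Module
```

It could be called as CommitReadCommitted(Sub() UnitOfWork.Commit()), and the project needs a reference to the System.Transactions assembly. On SQL Server with default settings, a ReadCommitted reader can still wait on rows that an uncommitted write has locked, so keeping each commit small still matters.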
But as for EF controlling another process’s operations: no, that’s not possible.