Basic Details
I have a LINQ statement that grabs some records from a database and puts them in a System.Linq.Enumerable:
var someRecords = someRepoAttachedToDatabase.Where(p=>true);
Suppose this grabs tons (25k+) of records, and I need to perform update operations on all of them. To speed things up, I have decided to use paging and perform the operations in blocks of 100 instead of on all of the records at once.
The code in question is used in two places: a service method that updates a lot of values in a database, and an integration test that gets the old and updated values to make sure the update was performed correctly.
The Question
The line in question is the one where I count the number of records in the subset to see if we are on the last page: if the number of records in the subset is less than the page size, there are no more records left. What I would like to know is: what is the fastest way to do this check?
Code in Question
int pageSize = 100;
bool moreData = true;
int currentPage = 1;
while (moreData)
{
    var subsetOfRecords = someRecords.Skip((currentPage - 1) * pageSize).Take(pageSize); // this is also a System.Linq.Enumerable
    if (subsetOfRecords.Count() < pageSize) { moreData = false; } // line in question
    // do stuff to records in subset
    currentPage++;
}
Things I Have Considered
- subsetOfRecords.Count() < pageSize
- subsetOfRecords.ElementAt(pageSize - 1) == null (throws an out-of-range exception when the page is short; I can catch the exception and set moreData to false there)
- Converting subsetOfRecords to an array (converting someRecords to an array will not work due to the way subsetOfRecords is declared – but I am open to changing it)
I’m sure there are plenty of other ideas that I have missed.
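The third idea above (materializing each page) can be sketched roughly as follows. This is only an illustration under assumptions: ProcessInPages and its processPage callback are hypothetical names, and the data is an in-memory sequence rather than the real repository. Calling ToList() runs the Skip/Take query once per page, after which Count is a plain property read instead of a second enumeration. If the real source is an IQueryable, the parameter would presumably need to be typed as IQueryable<T> (with a stable ordering) so that Skip and Take are translated to the database rather than run in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Hypothetical helper: walks `source` in pages of `pageSize` and hands
    // each non-empty page to `processPage`. Returns the number of pages processed.
    public static int ProcessInPages<T>(IEnumerable<T> source, int pageSize, Action<List<T>> processPage)
    {
        int pagesProcessed = 0;
        while (true)
        {
            // ToList() executes the Skip/Take query exactly once for this page.
            List<T> page = source.Skip(pagesProcessed * pageSize).Take(pageSize).ToList();

            // Empty page: the total was an exact multiple of pageSize, nothing left to do.
            if (page.Count == 0) break;

            processPage(page);
            pagesProcessed++;

            // Short page: this was the last one. Count is a property here, not a re-enumeration.
            if (page.Count < pageSize) break;
        }
        return pagesProcessed;
    }
}
```

For example, PagingSketch.ProcessInPages(Enumerable.Range(0, 250), 100, page => { /* update records */ }) hands the callback pages of 100, 100 and 50 items.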
Use the parallel library. It will handle the parallelization and paging for you automatically. Is the order in which the records are processed important?
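Assuming the library meant here is the .NET Task Parallel Library, a minimal sketch of the idea might look like the following. The ParallelSketch class, the DoubleAll method, and the "double each value" work are placeholders for the real update, and the input is an in-memory sequence; a real database context may not be safe to share across threads. Parallel.ForEach partitions the source itself, so no manual page counting is needed, but it does not guarantee processing order, which is why the question about ordering matters.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSketch
{
    // Hypothetical stand-in for the real update: doubles every record in parallel.
    public static int[] DoubleAll(IEnumerable<int> records)
    {
        // ConcurrentBag is safe to add to from several threads at once.
        var results = new ConcurrentBag<int>();

        // Parallel.ForEach splits `records` into chunks internally;
        // the order in which items are processed is not guaranteed.
        Parallel.ForEach(records, record => results.Add(record * 2));

        // Sort afterwards so the output is deterministic despite the unordered processing.
        return results.OrderBy(x => x).ToArray();
    }
}
```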