I query a database and push the result out to the console and a file using two methods like this:
var result = from p in _db.Pages
join r in rank on p.PkId equals r.Key
orderby r.NumPages descending
select new KeyNameCount
{
PageID = p.PkId,
PageName = p.Name,
Count = r.NumPages
};
WriteFindingsToFile(result, parentId);
WriteFindingsToConsole(result, parentId);
IEnumerable<T>, not IQuerable<T> is used as parametertype for result when used as a parameter in the two method calls.
In both calls the result is iterated in a foreach. This leads to two identical calls against the database, one for each method.
I could refactor the two methods into one and only use one foreach, but that would fast become very hard to maintain (adding write to html, write to xml, etc.)
I am pretty sure this is a farly common question, but using google has not made me any wiser, so I turn to you guys 🙂
Every time you access a LINQ query it will requery the database to refresh the data. To stop this happening use
.ToArray()or.ToList().e.g.
It’s important to understand that a raw LINQ query is run when it is used, not when it is written in the code (e.g. don’t dispose of your _db before then).
It can be surprising when you realise how it really works. This allows method chaining and later modification of the query to be reflected in the final query run on the DB. It is important to always keep in mind as it can cause run-time bugs that will not be caught at compile time, usually because the DB connection is closed before the list is used, as you are passing around what appear to be a simple IEnumerable.
EDIT: Changed to remove my opinion and reflect the discussion in the comments. Personally I think the compiler should assume that the end result of chained queries is immediately run unless you explicitly say that it’ll be further modified later. Just to avoid the run-time bugs it inevitably causes.