I’m using EF for a set of objects and want to query one type of object with the matching entries from two other tables. This is to dump the query into an XML output. At first, used a join since in the original data each Employee always had 1+ instances of a Computer object (solution #2 below), but it’s not necessarily true.
For purpose here, imagine:
- an Employee object,
- each Employee has an EmployeeType (a few fixed entries) and
- each Employee has zero or more Computer objects (typically 0-3).
- each Computer belongs to one Employee, not all Employee has a Computer.
- each Employee has a criteria on which the search is based (such as Division).
So I saw several possible solutions:
-
Use the Employee.Computer.Load() within a loop, but with 10,000+ rows, that causes an enormous performance penalty.
-
Use a join in the query, but that leaves out all Employees that do not have a Computer.
-
Use Linq to Entities,
but that seems to have the overhead of #1: when loading Computers it hits the database for EACH Employee. -
Use a second query (all Computers with matching Computer.Employee.Division), then in the Employee loop, add any Computer for the given employee. While implementing this, I found that just running the second query (with ToList()) the EF populates the correct Employee.Computer lists with the objects I need.
Here, #4 loads the data with only 2 database hits instead of 10k+, and the EntityFramework actually collates the objects and creates all of the relationships.
My Questions:
- With #4, is the fact that the EF populates the Employee.Computer list something on which I can I rely? If so, can you point me to the documentation?
- Is there a better way than #4?
UPDATE:
Well, bugger. Sorry, but I simply blew it. I was focusing on the
relationship with the “Computer” table and missed the fact that I
had an explicit Employee.EmployeeTypeReference.Load() w/o first testing
for null, so the “Computer” list was a complete non-issue.
I only found this when running some performance tests and adding
Craig’s solution to the mix. In truth, the records aren’t “Employees”
and “Computers” but abstractions, and I (conditionally) include every
field in the XML output, but they’re small: a Name, ID(PK) and ID(FK)
plus an INT on the “Employee” table. So, my presumption was that the
performance would be similar since the EF would create objects not
much heavier than the projection.
Anyway, here are the results wher the “Elapsed” time was
the difference before this query and after the resultant XML was created.
-
Case 1: Same as #2, but with Include() statements:
list = ve.Employee.Include("Computer").Include("EmployeeType").Where(e => e.Division.ID == divId).OrderBy(e => e.Name);Elapsed: 4.96, 5.05
-
Case 2: Uses in-line Load():
list = ve.Employee.Where(e => e.Division.ID == divId).OrderBy(e => e.Name);Elapsed: 74.62
-
Case 3: Same as #4, but with Include() statements:
list = from e in ve.Employee.Include("Computer").Include("EmployeeType")
where e.Division.ID == divId
orderby e.Name select e;Elapsed: 4.91, 5.47
-
Case 4: Uses in-line Load():
list = from e in ve.Employee where e.Division.ID == divId orderby e.Name select e;Elapsed: 74.20
-
Case 5: Use *Include(“EmployeeType”) and separate “Computer” query, let EF associate:
elist = ve.Employee.Include("EmployeeType").Where(te => te.Division.ID == divId).OrderBy(e => e.Name).ToList();
alist = ve.Alias.Where(a => a.Employee.Division.ID == divId).ToList();Elapsed: 4.50, 4.02
-
Case 6: Craig’s suggestion of projections:
elist = from te in ve.ThesaurusEntry where te.Division.ID==divID
orderby te.Name select new {
ID = te.ID, Name = te.Name,
Type = te.EmployeeType.Name,
Freq = te.Frequency,
Aliases = from a in te.Alias
select new {
ID = a.ID, Name = a.Name
}
};Elapsed: 0.73, 1.25
Conclusions
Load() is expensive, so use Include() or at least test with IsLoaded
Projection is a bit tedious, but significantly faster than EF fix-up. [with this limited testing on “narrow” tables]
Rob’s solution will work (+1), but if you don’t need all of the fields in, say, Employee and Computer, I’d strongly recommend projecting instead:
Here you get all you need in one query, but nothing more: All the fields you don’t need won’t be returned.
You could probably even select into
XElements and just save the resulting tree rather than manually converting to XML. I haven’t tried this, but it seems like it should work.Regarding #4, yes, you can rely on this, but it’s always good to check
IsLoadedbefore callingLoad().