Some basics
I have two tables, one holding the users and one holding a log with logins.
The user table holds something like 15000+ users; the login table is growing and is reaching 150000+ rows.
The database is built upon SQL Server (not Express).
To administer the users I have a grid view (ASPxGridView from DevExpress) that I populate from an ObjectDataSource.
Are there any general do’s and don’ts I should know about when summarizing the number of logins a user has made?
Things are getting strangely slow.
Here is a picture showing the involved tables.

I’ve tried a few things.
DbDataContext db = new DbDataContext();
// Using foreign key relationship
foreach (var proUser in db.tblPROUsers)
{
    // Count on the EntitySet lazy-loads that user's login rows (one extra query per user)
    var count = proUser.tblPROUserLogins.Count;
    //...
}
Execution time: 01:29.316 (1 minute and 29 seconds)
// By storing a list in a local variable (I removed the FK relation)
var userLogins = db.tblPROUserLogins.ToList();
foreach (var proUser in db.tblPROUsers)
{
    var count = userLogins.Where(x => x.UserId.Equals(proUser.UserId)).Count();
    //...
}
Execution time: 01:18.410 (1 minute and 18 seconds)
// By storing a dictionary in a local variable (I removed the FK relation)
var userLogins = db.tblPROUserLogins.ToDictionary(x => x.UserLoginId, x => x.UserId);
foreach (var proUser in db.tblPROUsers)
{
    var count = userLogins.Where(x => x.Value.Equals(proUser.UserId)).Count();
    //...
}
Execution time: 01:15.821 (1 minute and 15 seconds)
The model giving the best performance is actually the dictionary. However, if you know of any other options I’d like to hear about them, and also whether there’s something “bad” about this kind of coding when handling such large amounts of data.
Thanks
========================================================
UPDATED with a model according to BrokenGlass’ example
// By counting in the database with a separate query per user
foreach (var proUser in db.tblPROUsers)
{
    var userId = proUser.UserId;
    var count = db.tblPROUserLogins.Count(x => x.UserId.Equals(userId));
    //...
}
Execution time: 02:01.135 (2 minutes and 1 second)
In addition to this I created a list storing a simple class
public class LoginCount
{
    public int UserId { get; set; }
    public int Count { get; set; }
}
And in the summarizing method
var loginCount = new List<LoginCount>();
// This foreach loop takes approx 30 secs
foreach (var login in db.tblPROUserLogins)
{
    var userId = login.UserId;
    // Check if available
    var existing = loginCount.Where(x => x.UserId.Equals(userId)).FirstOrDefault();
    if (existing != null)
        existing.Count++;
    else
        loginCount.Add(new LoginCount { UserId = userId, Count = 1 });
}
// Calling it
foreach (var proUser in tblProUser)
{
    var user = proUser;
    var userId = user.UserId;
    // Count logins
    var count = 0;
    var loginCounter = loginCount.Where(x => x.UserId.Equals(userId)).FirstOrDefault();
    if (loginCounter != null)
        count = loginCounter.Count;
    //...
}
Execution time: 00:36.841 (36 seconds)
Conclusion so far: summarizing with LINQ is slow, but I’m getting there!
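One likely reason the in-memory variants above remain slow is that every per-user lookup scans the whole collection: the list is searched linearly, and the dictionary is keyed by UserLoginId, so finding a UserId still means walking all entries. A minimal sketch of the alternative, counting in a single pass into a dictionary keyed by UserId, could look like this. The LoginTally class and its method names are hypothetical and not part of the original code; the sketch only assumes that UserId is an int, as in the LoginCount class.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original code base.
public static class LoginTally
{
    // Counts logins per user in one pass over the UserId of every login row.
    public static Dictionary<int, int> CountByUser(IEnumerable<int> loginUserIds)
    {
        var counts = new Dictionary<int, int>();
        foreach (var userId in loginUserIds)
        {
            int current;
            // current stays 0 when the user has not been seen yet
            counts.TryGetValue(userId, out current);
            counts[userId] = current + 1;
        }
        return counts;
    }

    // Hash lookup per user; users without any login yield 0.
    public static int CountFor(Dictionary<int, int> counts, int userId)
    {
        int count;
        return counts.TryGetValue(userId, out count) ? count : 0;
    }
}
```

With the data context from the question this might be called as LoginTally.CountByUser(db.tblPROUserLogins.Select(l => l.UserId)), which needs using System.Linq and should fetch only the UserId column. Each user's count is then a single dictionary lookup instead of a scan.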
Perhaps it would be useful to construct an SQL query that does the same thing and execute it independently of your application (in SQL Server Management Studio). Something like:
(NOTE: This just selects UserId. If you want other fields from tblPROUser, you’ll need a simple JOIN “on top” of this basic query.)
Ensure there is a composite index on {UserId, UserLoginId} and it is being used by the query plan. Having both fields in the index and in that order ensures your query can run without touching the tblPROUserLogin table:
Then benchmark and see if you can get a significantly better time than your LINQ code:
— EDIT —
The following LINQ snippet is equivalent to the query above:
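As a rough sketch of what a grouping query of this kind can look like in LINQ (not necessarily the snippet originally posted), the example below groups login rows by UserId and projects one count per group. To stay self-contained it runs against an in-memory list wrapped with AsQueryable(). LoginRow, UserLoginCount and LoginGrouping are hypothetical stand-ins, not types from the question's DbDataContext.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the login table.
public class LoginRow
{
    public int UserLoginId { get; set; }
    public int UserId { get; set; }
}

// Hypothetical result type: one entry per user that has at least one login.
public class UserLoginCount
{
    public int UserId { get; set; }
    public int Count { get; set; }
}

public static class LoginGrouping
{
    // Groups the logins by UserId and counts the rows in each group.
    public static List<UserLoginCount> CountPerUser(IQueryable<LoginRow> logins)
    {
        return logins
            .GroupBy(l => l.UserId)
            .Select(g => new UserLoginCount { UserId = g.Key, Count = g.Count() })
            .OrderBy(c => c.UserId) // deterministic order for display
            .ToList();
    }
}
```

The same GroupBy/Select shape applied to db.tblPROUserLogins is something LINQ to SQL can translate into a single GROUP BY statement, and assigning Console.Out to the data context's Log property echoes the generated SQL.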
Which prints the following text in the console:
— EDIT 2 —
To have the “whole” user and not just UserId, you can do this:
And the console output shows the following query…
…which should be very efficient provided your indexes are correct:
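For illustration, the “whole user plus count” idea can be sketched in memory with GroupJoin, which pairs each user with its matching logins and also keeps users that have no logins at all. This is a sketch under assumptions rather than the query referred to above: UserRow, LoginEntry, UserWithLoginCount and UserLoginSummary are hypothetical names, and the UserName property is invented because the real columns of tblPROUser are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a user row; UserName is an invented column.
public class UserRow
{
    public int UserId { get; set; }
    public string UserName { get; set; }
}

// Hypothetical stand-in for a login row.
public class LoginEntry
{
    public int UserLoginId { get; set; }
    public int UserId { get; set; }
}

// The whole user together with the number of logins.
public class UserWithLoginCount
{
    public UserRow User { get; set; }
    public int Count { get; set; }
}

public static class UserLoginSummary
{
    // GroupJoin indexes the logins once, then emits every user (in input order)
    // with the count of its matching logins; users without logins get 0.
    public static List<UserWithLoginCount> Summarize(
        IEnumerable<UserRow> users, IEnumerable<LoginEntry> logins)
    {
        return users
            .GroupJoin(
                logins,
                u => u.UserId,
                l => l.UserId,
                (u, userLogins) => new UserWithLoginCount { User = u, Count = userLogins.Count() })
            .ToList();
    }
}
```

Because every user appears in the result, the list can be handed to a grid as-is, with 0 shown for users who never logged in.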