I have a List of Objects (roughly 100k) that is must iterate upon in order to produce a Dictionary.
however the code is performing very slowly, specifically on one line
public class Item{
public int ID;
public int Secondary_ID;
public string Text;
public int Number;
}
Data Looks something like (100k lines)
ID | Secondary_ID | Text | Number
1 | 1 | "something" | 3
1 | 1 | "something else"| 7
1 | 1 | "something1" | 4
1 | 2 | "something2" | 344
2 | 3 | "something3" | 74
2 | 3 | "something4" | 1
and i would like it to look like this when finished. (any collection will do to be honest)
Dictionary<int, string>
Key | Value
(secondary_ID) | (Text : Number)
1 | "Something : 3, Something else : 7, Something1 : 4"
2 | "Something2 : 344"
3 | "Something3 : 74, Something4 : 1"
My code currently works like this ListAll contains all data.
var Final=new Dictionary<int, string>();
var id1s=ListAll.Select(x => x.ID).Distinct().ToList();
foreach(var id1 in id1s) {
var shortList=ListAll.Where(x => x.ID==id1).ToList(); //99% of time spent is here
var id2s=shortList.Select(x => x.Secondary_ID).Distinct().ToList();
foreach(var id2 in id2s) {
var s=new StringBuilder();
var items=shortList.Where(x => x.Secondary_ID==id2).ToList();
foreach(var i in items) {
s.Append(String.Format("{0} : {1}", i.Text, i.Number));
}
Final.Add(id2, s.ToString());
}
}
return Final;
now the output is correct however as stated in the above comment, this takes an incredibly long time to process (90 seconds – certainly more than i am comfortable with) and was wondering if there is a faster way of achieving this.
This code is only going to be used once so is not really a normal usage and normally I would ignore it for that reason, but was wondering for learning purposes.
A much more efficient (and even easier to write) method of grouping the items by ID is to use
GroupBy.As for the reason that your code is so slow, you’re searching through the entire list for each distinct ID. That’s an O(n^2) operation.
GroupBydoesn’t do that. It uses a hash based structure internally, based on whatever you’re grouping on, so that it can quickly (in O(1) time) find the bucket that any given item belongs in, as opposed to the O(n) time it takes your method.