I need to process a list of objects that may contain duplicates according to certain criteria, merging each set of duplicates into a single entry. So far the code works, but it takes 10 minutes for 50,000 rows in my list.
Here is the code:
public class TestObject
{
    public string value1;
    public string value2;
    public string value3;
    public string value4;
    public int num1;
    public int num2;
}
public static List<TestObject> ReturnTestObjectListWithoutDoubloon(List<TestObject> source)
{
    var destination = new List<TestObject>();
    var list = new Dictionary<int, TestObject>();
    while (source.Count > 0)
    {
        list.Clear();
        var originalElement = source[0];
        foreach (var query in source.Select((element, index) => new { Value = element, Index = index })
            .Where(currentElement => (currentElement.Value.value1 == originalElement.value1)
                && (currentElement.Value.value2 == originalElement.value2)
                && (currentElement.Value.value3 == originalElement.value3)
                && (currentElement.Value.value4 == originalElement.value4)))
        {
            list.Add(query.Index, query.Value);
        }
        if (list.Count > 1)
        {
            originalElement.num1 = list.Sum(a => a.Value.num1);
            originalElement.num2 = list.Sum(a => a.Value.num2);
        }
        destination.Add(originalElement);
        foreach (var positionToremove in list.Keys)
            source.RemoveAt(positionToremove);
    }
    return destination;
}
The idea is to shrink the list on each pass through the while loop, so that my LINQ query runs on the smallest list possible. However, the fewer duplicates I have, the slower it is. I am looking for a solution with the smallest possible runtime; memory is not an issue.
Does anyone have an idea?
I’ve tried to follow your code through – and it looks like you are simply looking to remove duplicates from your source list?
If that is the case:
- then I think your source.RemoveAt code might be broken, as it might remove the wrong elements: each removal shifts the index of every later item, so the remaining stored positions no longer point at the duplicates.
- then you should be able to run a single GroupBy() operation on this source list. That should work using a hashtable, which should be much quicker than your existing loops-inside-loops operation.
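As a rough sketch of the GroupBy() suggestion, assuming the goal is to merge rows that share value1 to value4 and to sum num1 and num2 across each group, something like the following might work. The class name Deduplicator and the method name MergeDuplicates are made up for illustration. Unlike the original loop, this version creates new TestObject instances and leaves the source list untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TestObject
{
    public string value1;
    public string value2;
    public string value3;
    public string value4;
    public int num1;
    public int num2;
}

public static class Deduplicator
{
    // Hypothetical helper (not from the original post): one pass over the
    // source, grouping on the four string fields and summing the counters.
    public static List<TestObject> MergeDuplicates(IEnumerable<TestObject> source)
    {
        return source
            // Anonymous types implement Equals/GetHashCode over their
            // properties, so this acts as a composite hash key.
            .GroupBy(x => new { x.value1, x.value2, x.value3, x.value4 })
            // Build one merged object per group; groups come out in the
            // order in which each key first appears in the source.
            .Select(g => new TestObject
            {
                value1 = g.Key.value1,
                value2 = g.Key.value2,
                value3 = g.Key.value3,
                value4 = g.Key.value4,
                num1 = g.Sum(x => x.num1),
                num2 = g.Sum(x => x.num2)
            })
            .ToList();
    }
}
```

Calling it would look like `var merged = Deduplicator.MergeDuplicates(source);`. Because grouping is hash-based, the work should grow roughly linearly with the number of rows instead of rescanning the whole list for every element, which is why the original gets slower when there are few duplicates to remove.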