I always seem to have a problem when I need to compare two lists and produce a third list which includes all unique items. I need to perform this quite often.
Here is my attempt to reproduce the issue with a noddy example.
Am I missing something?
Thanks for any suggestions
The wanted result:
Name= Jo1 Surname= Bloggs1 Category= Account
Name= Jo2 Surname= Bloggs2 Category= Sales
Name= Jo5 Surname= Bloggs5 Category= Development
Name= Jo6 Surname= Bloggs6 Category= Management
Name= Jo8 Surname= Bloggs8 Category= HR
Name= Jo7 Surname= Bloggs7 Category= Cleaning
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<Customer> listOne = new List<Customer>();
        List<Customer> listTwo = new List<Customer>();

        listOne.Add(new Customer { Category = "Account", Name = "Jo1", Surname = "Bloggs1" });
        listOne.Add(new Customer { Category = "Sales", Name = "Jo2", Surname = "Bloggs2" });
        listOne.Add(new Customer { Category = "Development", Name = "Jo5", Surname = "Bloggs5" });
        listOne.Add(new Customer { Category = "Management", Name = "Jo6", Surname = "Bloggs6" });

        listTwo.Add(new Customer { Category = "HR", Name = "Jo8", Surname = "Bloggs8" });
        listTwo.Add(new Customer { Category = "Sales", Name = "Jo2", Surname = "Bloggs2" });
        listTwo.Add(new Customer { Category = "Management", Name = "Jo6", Surname = "Bloggs6" });
        listTwo.Add(new Customer { Category = "Development", Name = "Jo5", Surname = "Bloggs5" });
        listTwo.Add(new Customer { Category = "Cleaning", Name = "Jo7", Surname = "Bloggs7" });

        // I get duplicates, why?
        List<Customer> resultList = listOne.Union(listTwo).ToList();
        resultList.ForEach(customer => Console.WriteLine("Name= {0} Surname= {1} Category= {2}", customer.Name, customer.Surname, customer.Category));
        Console.Read();

        // Does not work
        IEnumerable<Customer> resultList3 = listOne.Except(listTwo);
        foreach (var customer in resultList3)
        {
            Console.WriteLine("Name= {0} Surname= {1} Category= {2}", customer.Name, customer.Surname, customer.Category);
        }

        // Does not work
        var resultList2 = listOne
            .Where(n => !listTwo.Select(o => o.Category).Contains(n.Category))
            .OrderBy(n => n.Category);
        foreach (var customer in resultList2)
        {
            Console.WriteLine("Name= {0} Surname= {1} Category= {2}", customer.Name, customer.Surname, customer.Category);
        }
        Console.Read();
    }
}

public class Customer
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public string Category { get; set; }
}
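The crux of the problem is that the Customer class doesn’t override .Equals(), so .Union, .Except and .Distinct fall back to reference equality and treat two Customer objects with identical field values as different items. If you override .Equals (and .GetHashCode), then .Union and .Distinct will use them to eliminate duplicates. If you don’t own the Customer implementation, however, adding .Equals may not be an option.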
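A minimal sketch of the override approach, assuming two customers count as equal when Name, Surname and Category all match (that rule, the IEquatable<Customer> implementation and the EqualsDemo helper are illustrative choices, not part of the original code; the tuple syntax needs C# 7 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer : IEquatable<Customer>
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public string Category { get; set; }

    // Assumption: equality means all three fields match.
    public bool Equals(Customer other)
    {
        if (other is null) return false;
        return Name == other.Name
            && Surname == other.Surname
            && Category == other.Category;
    }

    public override bool Equals(object obj) => Equals(obj as Customer);

    // Equal objects must return equal hash codes; the tuple combines the fields.
    public override int GetHashCode() => (Name, Surname, Category).GetHashCode();
}

// Hypothetical helper, for illustration only.
public static class EqualsDemo
{
    public static List<Customer> Merge(List<Customer> first, List<Customer> second)
    {
        // Union uses the default comparer, which now calls the overrides above.
        return first.Union(second).ToList();
    }
}
```

With these overrides in place, listOne.Union(listTwo) should drop the repeated Jo2, Jo5 and Jo6 entries, and listOne.Except(listTwo) should compare by value as well.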
An alternative is to pass a custom IEqualityComparer<Customer> to .Distinct() (or to .Union(), which accepts one too). This lets you compare objects in different ways depending on which comparer you pass in.
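A rough sketch of the comparer approach, assuming the same "all three fields match" rule; CustomerComparer and ComparerDemo are hypothetical names introduced here, and Customer is left untouched:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public string Category { get; set; }
}

// Hypothetical comparer: equality is defined outside the Customer class.
public class CustomerComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name
            && x.Surname == y.Surname
            && x.Category == y.Category;
    }

    public int GetHashCode(Customer obj)
    {
        if (obj is null) return 0;
        return (obj.Name, obj.Surname, obj.Category).GetHashCode();
    }
}

public static class ComparerDemo
{
    public static List<Customer> Merge(List<Customer> first, List<Customer> second)
    {
        var comparer = new CustomerComparer();
        // Equivalent here: first.Concat(second).Distinct(comparer).ToList()
        return first.Union(second, comparer).ToList();
    }
}
```

A different comparer (for example one that looks only at Category) could be passed to the same call when a different notion of "duplicate" is needed.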
Another alternative is to GroupBy the fields that are important and take any item from the group (since the GroupBy acts as .Equals in this case). This requires the least code to be written.
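For example, Concat the two lists, GroupBy a key built from Name, Surname and Category, and take the First item of each group, which gets your desired result.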
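A minimal sketch of the GroupBy approach, assuming the two lists are first combined with Concat and that "|" never appears in any field value (both are assumptions, as is the GroupByDemo name):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public string Category { get; set; }
}

// Hypothetical helper, for illustration only.
public static class GroupByDemo
{
    public static List<Customer> Merge(List<Customer> first, List<Customer> second)
    {
        return first.Concat(second)
            // The string key stands in for Equals; "|" keeps the fields apart.
            .GroupBy(c => c.Name + "|" + c.Surname + "|" + c.Category)
            // Every item in a group has the same key, so any one will do.
            .Select(g => g.First())
            .ToList();
    }
}
```

GroupBy yields groups in the order their keys first appear, so with the lists from the question this should give Jo1, Jo2, Jo5, Jo6, Jo8, Jo7, the same order as the wanted result.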
As a rule, I use a unique delimiter to combine fields so that two items that should be different don’t unexpectedly combine to the same key. Consider: {Name=abe, Surname=long} and {Name=abel, Surname=ong} would both get the GroupBy key "abelong" if a delimiter isn’t used.
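The collision can be seen with two small key builders; KeyDemo and its method names are hypothetical, and "|" is again assumed not to occur in the data:

```csharp
using System;

// Hypothetical helpers showing how the GroupBy key could be built.
public static class KeyDemo
{
    // Plain concatenation: field boundaries are lost.
    public static string WithoutDelimiter(string name, string surname)
    {
        return name + surname;
    }

    // A delimiter keeps the boundary between the fields visible in the key.
    public static string WithDelimiter(string name, string surname)
    {
        return name + "|" + surname;
    }
}
```

WithoutDelimiter("abe", "long") and WithoutDelimiter("abel", "ong") both return "abelong", while WithDelimiter gives "abe|long" and "abel|ong". If no safe delimiter exists, grouping on an anonymous type such as new { c.Name, c.Surname } is another option, since anonymous types compare by their property values.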