I know this has been asked many times, but I cannot find anything that works.
I am reading a CSV file and then I have to remove duplicate lines based on one of the columns, "CustomerID".
Basically, the CSV file can have multiple lines with the same CustomerID.
I need to remove the duplicates.
//DOES NOT WORK
var finalCustomerList = csvCustomerList.Distinct().ToList();
I have also tried this extension method //DOES NOT WORK
public static IEnumerable<t> RemoveDuplicates<t>(this IEnumerable<t> items)
{
return new HashSet<t>(items);
}
What works for me is:
- I read the CSV file into a csvCustomerList.
- I loop through csvCustomerList and check whether a customer already exists. If it doesn't, I add it.

foreach (var csvCustomer in csvCustomerList)
{
    var newCustomer = new customer();
    newCustomer.CustomerID = csvCustomer.CustomerID;
    newCustomer.Name = csvCustomer.Name;
    //etc.....
    var exists = finalCustomerList.Exists(x => x.CustomerID == csvCustomer.CustomerID);
    if (!exists)
    {
        finalCustomerList.Add(newCustomer);
    }
}

Is there a better way of doing this?
For Distinct to work with non-standard equality checks, you need to make your class customer implement IEquatable<T>. In the Equals method, simply compare the customer IDs and nothing else. Override GetHashCode as well so that it is based on the customer ID only, because Distinct compares hash codes before it calls Equals.

As an alternative, you can use the overload of Distinct that takes an IEqualityComparer<T> and create a class that implements that interface for customer. That way, you don't need to change the customer class.

Or you can use MoreLINQ, as suggested in another answer.
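
A minimal sketch of both options, under some assumptions: the class is written here as Customer (standing in for the customer class from the question), CustomerID is assumed to be an int, and CustomerIdComparer is a hypothetical name. Only CustomerID and Name are known from the question. Either the IEquatable<T> implementation or the comparer is enough on its own; they appear together only so both can be compared side by side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's customer class.
// Assumption: CustomerID is an int; the real type is not shown in the question.
public class Customer : IEquatable<Customer>
{
    public int CustomerID { get; set; }
    public string Name { get; set; }

    // Option 1: equality is defined by CustomerID and nothing else.
    public bool Equals(Customer other) => other != null && CustomerID == other.CustomerID;

    public override bool Equals(object obj) => Equals(obj as Customer);

    // Must agree with Equals: equal customers return equal hash codes.
    public override int GetHashCode() => CustomerID.GetHashCode();
}

// Option 2: a separate comparer, which leaves the Customer class untouched.
// The name CustomerIdComparer is illustrative.
public class CustomerIdComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.CustomerID == y.CustomerID;
    }

    public int GetHashCode(Customer obj) => obj.CustomerID.GetHashCode();
}

public static class Program
{
    public static void Main()
    {
        // Sample data made up for this sketch; CustomerID 1 appears twice.
        var csvCustomerList = new List<Customer>
        {
            new Customer { CustomerID = 1, Name = "Alice" },
            new Customer { CustomerID = 2, Name = "Bob" },
            new Customer { CustomerID = 1, Name = "Alice (duplicate row)" },
        };

        // Uses IEquatable<Customer> through the default equality comparer.
        var viaEquatable = csvCustomerList.Distinct().ToList();

        // Uses the explicit comparer instead.
        var viaComparer = csvCustomerList.Distinct(new CustomerIdComparer()).ToList();

        Console.WriteLine(viaEquatable.Count); // 2
        Console.WriteLine(viaComparer.Count);  // 2
    }
}
```

The plain Distinct() call in the question most likely failed because a class without its own Equals and GetHashCode is compared by reference, so two rows with the same CustomerID still count as different objects. On .NET 6 or later, csvCustomerList.DistinctBy(c => c.CustomerID) should give the same result without any extra class.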