I am using C# + VS2008 + .NET + ASP.NET + IIS 7.0 + ADO.NET + SQL Server 2008. I have an ADO.NET DataTable object and want to filter out duplicate/similar records, keeping only one record from each group of duplicates. My rule for judging duplicates: rows that have the same value in a particular string column are treated as duplicate/similar records.
The output needs to be a DataTable. It may be the same DataTable object if the filtering can be done in place.
What is the most efficient solution?
Are you using .NET 3.5? If you cast your data rows, you can use LINQ to Objects:
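A minimal sketch of this idea, assuming the string column is called `Name` (a hypothetical name, as are the class and method names below). It casts the rows to `DataRow` and passes a custom `IEqualityComparer<DataRow>` to `Distinct`; `Field<T>` needs a reference to System.Data.DataSetExtensions:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Treats two rows as equal when their "Name" column holds the same string.
// "Name" is a placeholder; substitute your own string column.
public sealed class NameColumnComparer : IEqualityComparer<DataRow>
{
    public bool Equals(DataRow x, DataRow y)
    {
        return string.Equals(
            x.Field<string>("Name"),
            y.Field<string>("Name"),
            StringComparison.Ordinal);
    }

    public int GetHashCode(DataRow row)
    {
        // Field<string> returns null for DBNull, so guard before hashing.
        string name = row.Field<string>("Name");
        return name == null ? 0 : name.GetHashCode();
    }
}

public static class DistinctWithComparer
{
    // Returns one row for each distinct value of the "Name" column.
    public static List<DataRow> DistinctRows(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            .Distinct(new NameColumnComparer())
            .ToList();
    }
}
```

The returned rows still belong to the original table; nothing is copied or removed at this point.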
Or an even simpler way, since you’re basing it on a single column’s values:
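One way this could look, again assuming a hypothetical `Name` column: group the rows by that column and take the first row of each group.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DistinctByGrouping
{
    // Keeps the first row seen for each distinct "Name" value.
    // "Name" is a placeholder column name.
    public static List<DataRow> FirstRowPerName(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            .GroupBy(row => row.Field<string>("Name"))
            .Select(group => group.First())
            .ToList();
    }
}
```

`GroupBy` yields groups in the order their keys first appear, so the surviving rows keep their original relative order.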
If you need to make a new DataTable out of these distinct rows, you can do this:
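A sketch of building the new table, assuming the same hypothetical `Name` column. It uses `DataTable.Clone()` to copy the schema without rows and `ImportRow` to copy each distinct row across; other approaches are possible.

```csharp
using System.Data;
using System.Linq;

public static class DistinctTableBuilder
{
    // Builds a new DataTable holding one row per distinct "Name" value.
    // The source table is left untouched.
    public static DataTable ToDistinctTable(DataTable table)
    {
        // Clone copies columns and constraints but no data.
        DataTable result = table.Clone();

        var distinctRows = table.Rows.Cast<DataRow>()
            .GroupBy(row => row.Field<string>("Name"))
            .Select(group => group.First());

        foreach (DataRow row in distinctRows)
        {
            // ImportRow copies the values (and row state) into the new table.
            result.ImportRow(row);
        }

        return result;
    }
}
```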
Or if it would be more handy to work with business objects, you can do an easy conversion:
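For illustration, a sketch that projects the distinct rows into a hypothetical `Person` class. The class, its properties, and the `Id`/`Name` column names are all assumptions; adapt them to your own schema.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical business object; the property names mirror assumed column names.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PersonMapper
{
    // Maps one row per distinct "Name" value to a Person.
    // Assumes "Id" is a non-null int column and "Name" is a string column.
    public static List<Person> ToDistinctPeople(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            .GroupBy(row => row.Field<string>("Name"))
            .Select(group => group.First())
            .Select(row => new Person
            {
                Id = row.Field<int>("Id"),
                Name = row.Field<string>("Name")
            })
            .ToList();
    }
}
```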
Update
Everything you can do in LINQ to Objects can also be done without it: it just takes more code. For example:
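A LINQ-free sketch of the same filtering, assuming the hypothetical `Name` column. It tracks the values already seen in a `Dictionary` and imports only the first row for each value into a cloned table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DistinctWithoutLinq
{
    // Returns a new table with the first row for each distinct "Name" value.
    public static DataTable DistinctByName(DataTable table)
    {
        DataTable result = table.Clone(); // same schema, no rows
        Dictionary<string, bool> seen =
            new Dictionary<string, bool>(StringComparer.Ordinal);

        foreach (DataRow row in table.Rows)
        {
            // Assumption: DBNull is mapped to the empty string, so a null
            // value and "" are treated as duplicates of each other.
            string key = (row["Name"] as string) ?? string.Empty;

            if (!seen.ContainsKey(key))
            {
                seen.Add(key, true);
                result.ImportRow(row);
            }
        }

        return result;
    }
}
```

This makes a single pass over the rows, with a constant-time dictionary lookup on average for each one.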
You could also use a similar strategy to simply remove duplicate rows from the original table instead of creating a new table.
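An in-place variant might look like the sketch below, under the same assumptions (hypothetical `Name` column, DBNull treated as an empty string). Duplicates are collected first and removed afterwards, because the row collection cannot be modified while it is being enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class InPlaceDeduplicator
{
    // Removes every row whose "Name" value was already seen earlier in the
    // table, and returns the number of rows removed.
    public static int RemoveDuplicates(DataTable table)
    {
        Dictionary<string, bool> seen =
            new Dictionary<string, bool>(StringComparer.Ordinal);
        List<DataRow> duplicates = new List<DataRow>();

        foreach (DataRow row in table.Rows)
        {
            string key = (row["Name"] as string) ?? string.Empty;
            if (seen.ContainsKey(key))
            {
                duplicates.Add(row);
            }
            else
            {
                seen.Add(key, true);
            }
        }

        foreach (DataRow row in duplicates)
        {
            // Remove drops the row from the collection immediately. Use
            // row.Delete() instead if the deletion should be sent to the
            // database by a later DataAdapter.Update call.
            table.Rows.Remove(row);
        }

        return duplicates.Count;
    }
}
```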