Which way is the best for removing duplicates from a DataTable for multiple columns?

Question


Asked: June 12, 2026 (2026-06-12T22:49:33+00:00)


Which way is the best for removing duplicates from a DataTable based on multiple columns? The code below only works for a single column.

public DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
   Hashtable hTable = new Hashtable();
   ArrayList duplicateList = new ArrayList();

   // Record each value seen in colName as a Hashtable key;
   // any row whose value is already present is collected as a duplicate.
   foreach (DataRow drow in dTable.Rows)
   {
      if (hTable.Contains(drow[colName]))
         duplicateList.Add(drow);
      else
         hTable.Add(drow[colName], string.Empty);
   }

   // Remove the collected duplicate rows from the DataTable.
   foreach (DataRow dRow in duplicateList)
      dTable.Rows.Remove(dRow);

   // Return the same DataTable, which now contains only unique rows.
   return dTable;
}

I tried using string[] colName instead, but it throws an error at dTable.Rows.Remove(dRow);

Please suggest.


1 Answer

Editorial Team · Answer 1 · 2026-06-12T22:49:34+00:00

The easiest and most readable approach is to use Linq-to-DataTable:

var groups = from r in dTable.AsEnumerable()
             group r by new
             {
                 Col1 = r.Field<String>("Column1"),
                 Col2 = r.Field<String>("Column2"),
             };

// if you only want the first row of each group:
DataTable distinctTable = groups.Select(g => g.First()).CopyToDataTable();

Notes: Enumerable.GroupBy groups the DataRows by an anonymous type with two properties (Col1 and Col2), which are initialized from the DataRow fields Column1 and Column2.

So you get groups of IEnumerable<DataRow>. Enumerable.First() returns the first DataRow of each group (you could also use different methods to select the row you want to keep, for example by ordering by a date field).

Then CopyToDataTable creates a new DataTable from the (now) distinct DataRows.
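As mentioned above, First() is not the only way to pick the surviving row of each group. The following is a minimal sketch of the "order by a date field" variant. It assumes a DateTime column named "Created", which is hypothetical and not part of the original question, and it assumes the project can use the AsEnumerable/CopyToDataTable extensions (System.Data.DataSetExtensions on .NET Framework). The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class DistinctRowsExample
{
    // Keeps one row per (Column1, Column2) pair: the row with the latest "Created" value.
    // "Created" is a hypothetical DateTime column used only for illustration.
    public static DataTable KeepLatestPerGroup(DataTable source)
    {
        List<DataRow> latestRows = source.AsEnumerable()
            .GroupBy(r => new
            {
                Col1 = r.Field<string>("Column1"),
                Col2 = r.Field<string>("Column2")
            })
            .Select(g => g.OrderByDescending(r => r.Field<DateTime>("Created")).First())
            .ToList();

        // CopyToDataTable throws InvalidOperationException when the sequence is empty,
        // so return an empty table with the same schema in that case.
        return latestRows.Count > 0 ? latestRows.CopyToDataTable() : source.Clone();
    }
}
```

The empty-sequence guard is worth keeping in any variant of this query, since an empty input table would otherwise make CopyToDataTable throw.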

Here’s a possible implementation if you’re using .NET 2:

First, an implementation of a custom IEqualityComparer<Object[]> for the dictionary:

class ObjectArrayComparer : IEqualityComparer<Object[]>
{
    public bool Equals(Object[] x, Object[] y)
    {
        if (x == null && y == null) return true;
        if (x == null || y == null) return false;
        if (x.Length != y.Length) return false;

        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] == null && y[i] == null) continue;
            if (x[i] == null || y[i] == null) return false;
            if (!x[i].Equals(y[i])) return false;
        }
        return true;
    }

    public int GetHashCode(Object[] obj)
    {
        int hash = 0;
        if (obj != null)
        {
            hash = (hash * 17) + obj.Length;
            foreach (Object o in obj)
            {
                hash *= 17;
                if (o != null) hash = hash + o.GetHashCode();
            }
        }
        return hash;
    }
}

Then your RemoveDuplicateRows method:

public DataTable RemoveDuplicateRows(DataTable dTable, String[] colNames)
{
    var hTable = new Dictionary<object[], DataRow>(new ObjectArrayComparer());

    foreach (DataRow drow in dTable.Rows)
    {
        Object[] objects = new Object[colNames.Length];
        for (int c = 0; c < colNames.Length; c++)
            objects[c] = drow[colNames[c]];
        if (!hTable.ContainsKey(objects))
            hTable.Add(objects, drow);
    }

    // create a clone with the same columns and import all distinct rows
    DataTable clone = dTable.Clone();
    foreach (var kv in hTable)
        clone.ImportRow(kv.Value);

    return clone;
}
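The custom comparer matters because arrays are compared by reference by default, so two object[] keys holding the same column values would otherwise count as different dictionary keys. The short sketch below illustrates that default behavior. The class and method names are hypothetical and exist only for this demonstration.

```csharp
using System;
using System.Collections.Generic;

static class ArrayKeyDemo
{
    // Returns how many keys a Dictionary with the default comparer ends up with
    // after inserting two distinct arrays that hold identical contents.
    public static int CountKeysWithDefaultComparer()
    {
        var seen = new Dictionary<object[], bool>(); // no custom comparer supplied
        object[] first = { "A", 1 };
        object[] second = { "A", 1 };

        seen[first] = true;
        seen[second] = true; // treated as a new key: arrays use reference equality

        return seen.Count;
    }
}
```

With the default comparer the method returns 2, which is why RemoveDuplicateRows passes an ObjectArrayComparer to the dictionary constructor.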

Testing:

var table = new DataTable();
table.Columns.Add("Column1", typeof(string));
table.Columns.Add("Column2", typeof(int));
table.Columns.Add("Column3", typeof(string));

Random r = new Random();
for (int i = 0; i < 100; i++)
{
    table.Rows.Add("Column1_" + r.Next(1, 10), r.Next(1, 10), "Column3_" + r.Next(1, 10));
}
int rowCount = table.Rows.Count; // 100
var unique = RemoveDuplicateRows(table, new[] { "Column1", "Column2" });
int uniqueRowCount = unique.Rows.Count; // around 55-65
