I am working in C#, reading a huge table from one database and loading it into a DataTable.
Since the table contains a huge number of rows (1,800,000+) and I keep getting an out-of-memory error, I tried to break the work down: copy 100,000 rows at a time, free up memory, and repeat until all the data from the source table has been loaded into my DataTable.
Can you look at my code and tell me if I am on the right track? From what I can see, I am reading the first 100,000 rows again and again, and my program runs indefinitely.
Is there a counter I need to add to my DataTable so that it picks up the next set of rows?
My code snippet is below:
public IoSqlReply GetResultSet(String directoryName, String userId, String password, String sql)
{
    IoSqlReply ioSqlReply = new IoSqlReply();
    DataTable dtResultSet = new DataTable();
    IoMsSQL ioMsSQL = null;
    int chunkSize = 100000;
    try
    {
        using (OdbcConnection conn = new OdbcConnection(cs))
        {
            conn.Open();
            using (OdbcCommand cmd = new OdbcCommand(sql, conn))
            {
                using (OdbcDataReader reader = cmd.ExecuteReader())
                {
                    for (int col = 0; col < reader.FieldCount; col++)
                    {
                        String colName = reader.GetName(col);
                        String colDataType = reader.GetFieldType(col).ToString();
                        dtResultSet.Columns.Add(reader.GetName(col), reader.GetFieldType(col));
                    }
                    // now copy each row/column to the datatable
                    while (reader.Read()) // loop round all rows in the source table
                    {
                        DataRow row = dtResultSet.NewRow();
                        for (int ixCol = 0; ixCol < reader.FieldCount; ixCol++) // loop round all columns in each row
                        {
                            row[ixCol] = reader.GetValue(ixCol);
                        }
                        // -------------------------------------------------------------
                        // finished processing the row, add it to the datatable
                        // -------------------------------------------------------------
                        dtResultSet.Rows.Add(row);
                        GC.Collect(); // free up memory
                    } // closing while
                    ioSqlReply.DtResultSet = dtResultSet; // return the data table
                    ioSqlReply.RowCount = dtResultSet.Rows.Count;
                    Console.WriteLine("DTRESULTSET:ROW COUNT FINAL : " + dtResultSet.Rows.Count);
                    ioSqlReply.Rc = 0;
                }
            }
        }
    }
You should limit the number of rows your SQL returns, for example to 10,000 rows per query.
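A minimal sketch of such a row limit, assuming the source is SQL Server (where `TOP` caps the row count). The table name `dbo.SourceTable`, the class name, and the method name are placeholders for illustration, not names from the question:

```csharp
using System;
using System.Data;
using System.Data.Odbc;

public static class LimitedQuery
{
    // Loads at most maxRows rows into a DataTable.
    // "dbo.SourceTable" is a hypothetical table name; substitute your own.
    public static DataTable LoadFirstRows(string connectionString, int maxRows)
    {
        // TOP (n) is SQL Server syntax. maxRows is an int, so formatting it
        // into the SQL text cannot inject arbitrary SQL.
        string sql = $"SELECT TOP ({maxRows}) * FROM dbo.SourceTable";

        var table = new DataTable();
        using (var conn = new OdbcConnection(connectionString))
        using (var cmd = new OdbcCommand(sql, conn))
        {
            conn.Open();
            using (OdbcDataReader reader = cmd.ExecuteReader())
            {
                // DataTable.Load creates the columns from the reader's schema
                // and copies the rows, replacing the manual column/row loops.
                table.Load(reader);
            }
        }
        return table;
    }
}
```

Other databases spell the limit differently (for instance `LIMIT n`), so the SQL text would need adjusting for a different source.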
If you don't do this and your query returns 1.8M rows, loading all of them into a single DataTable can easily exhaust the available memory.
But this makes your app process only the first 10,000 rows. If you need to process all rows, you should run that SQL repeatedly, moving on to the next block of rows each time, until there are no more rows.
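One way to sketch that loop is keyset paging: each query asks for the next block of rows after the last key already seen. This assumes SQL Server and a unique, ascending integer key column, here called `Id`. That column, the table name `dbo.SourceTable`, and the `processPage` callback are all hypothetical names for illustration. The important part is that `lastId` advances on every pass; if the starting point never moves, every query returns the same first block and the loop never ends.

```csharp
using System;
using System.Data;
using System.Data.Odbc;

public static class PagedReader
{
    // Reads the table one page at a time and hands each page to processPage.
    // Returns the total number of rows processed.
    // Hypothetical names: dbo.SourceTable and its unique integer key column Id.
    public static long ProcessAllRows(string connectionString, int pageSize,
                                      Action<DataTable> processPage)
    {
        long total = 0;
        long lastId = long.MinValue; // start before the smallest possible key

        // ODBC uses positional '?' parameter markers; TOP (n) is SQL Server syntax.
        string sql = $"SELECT TOP ({pageSize}) * FROM dbo.SourceTable " +
                     "WHERE Id > ? ORDER BY Id";

        using (var conn = new OdbcConnection(connectionString))
        {
            conn.Open();
            while (true)
            {
                using (var page = new DataTable())
                {
                    using (var cmd = new OdbcCommand(sql, conn))
                    {
                        cmd.Parameters.Add("@lastId", OdbcType.BigInt).Value = lastId;
                        using (OdbcDataReader reader = cmd.ExecuteReader())
                        {
                            page.Load(reader);
                        }
                    }

                    if (page.Rows.Count == 0)
                        break; // no more rows: done

                    processPage(page);
                    total += page.Rows.Count;

                    // Advance the starting point so the next query returns new rows.
                    lastId = Convert.ToInt64(page.Rows[page.Rows.Count - 1]["Id"]);
                }
            }
        }
        return total;
    }
}
```

A call might look like `PagedReader.ProcessAllRows(cs, 100000, page => Console.WriteLine(page.Rows.Count));`. Each page is disposed after it has been processed, so only one block of rows is held in memory at a time. Appending every page to one big DataTable would bring the memory problem back.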
This is a very rough example. It can be improved, but I think it is an easy fix for your problem.