We need to transform one DataTable into another, which mostly involves transposing the source DataTable into a different format.
Thinking of it sequentially, I have implemented it as below:
    DataTable riskTable;
    this.InitializeRmmRiskTable(out riskTable); // Initializes the columns

    var calculatedRisk = (from DataRow tradeRow in tradeTableToFilter.Rows
                          where tradeRow["TradeID"] != null
                          select new
                          {
                              ROW_ID = 0,
                              TCN = tradeRow["TradeID"].ToString(),
                              CCY = tradeRow["CURRENCY"],
                              USD_VALUE = calculator.Invoke(tradeRow) // configured delegate that will fetch the value
                          }).Distinct();

    foreach (var rowData in calculatedRisk)
    {
        DataRow rowToAdd = riskTable.NewRow();
        rowToAdd["ROW_ID"] = rowData.ROW_ID;
        rowToAdd["TCN"] = rowData.TCN;
        rowToAdd["CCY"] = rowData.CCY;
        rowToAdd["USD_VALUE"] = rowData.USD_VALUE;
        riskTable.Rows.Add(rowToAdd);
    }

    return riskTable;
Any suggestions to optimize this in terms of memory footprint and execution cycles?
You can remove a lot of column lookups easily enough by writing the values by column position rather than by column name.
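As a rough sketch of the positional approach: `DataRowCollection.Add(params object[])` matches values to columns by position, so no name lookup happens per row. The `CreateRiskTable` helper and the column order (ROW_ID, TCN, CCY, USD_VALUE) are assumptions standing in for `InitializeRmmRiskTable`, which isn't shown in the question.

```csharp
using System;
using System.Data;

public static class PositionalAddSketch
{
    // Hypothetical stand-in for InitializeRmmRiskTable; the column
    // order and types used here are assumptions for illustration.
    public static DataTable CreateRiskTable()
    {
        var table = new DataTable("Risk");
        table.Columns.Add("ROW_ID", typeof(int));
        table.Columns.Add("TCN", typeof(string));
        table.Columns.Add("CCY", typeof(string));
        table.Columns.Add("USD_VALUE", typeof(decimal));
        return table;
    }

    public static void AddRisk(DataTable riskTable, int rowId, string tcn, object ccy, object usdValue)
    {
        // Values are assigned to columns in order, so the row is created
        // and added in one call without any lookups by column name.
        riskTable.Rows.Add(rowId, tcn, ccy, usdValue);
    }
}
```

Inside the question's `foreach` this would replace the five-line body with a single `riskTable.Rows.Add(rowData.ROW_ID, rowData.TCN, rowData.CCY, rowData.USD_VALUE);` call.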
This assumes you know the order of the columns. Otherwise, the DataColumn API is the most direct, so you could store the four DataColumns and use those in the indexer. This applies equally to the reading code: fetch the source DataColumns once up front, then index each row with them, etc.
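A minimal sketch of the stored-DataColumn variant, covering both the reading and the writing side. The `CopyTrades` method name and the `Func<DataRow, object>` type for the calculator are assumptions (the question doesn't show the delegate's type). The sketch also uses `DataRow.IsNull` for the null check and leaves out `Distinct`, so it is not a drop-in replacement for the original query.

```csharp
using System;
using System.Data;

public static class CachedColumnSketch
{
    // Hypothetical helper; the calculator's delegate type is an assumption.
    public static void CopyTrades(DataTable tradeTable, DataTable riskTable, Func<DataRow, object> calculator)
    {
        // Resolve every column once, outside the loop.
        DataColumn srcTradeId = tradeTable.Columns["TradeID"];
        DataColumn srcCurrency = tradeTable.Columns["CURRENCY"];
        DataColumn dstRowId = riskTable.Columns["ROW_ID"];
        DataColumn dstTcn = riskTable.Columns["TCN"];
        DataColumn dstCcy = riskTable.Columns["CCY"];
        DataColumn dstUsdValue = riskTable.Columns["USD_VALUE"];

        foreach (DataRow tradeRow in tradeTable.Rows)
        {
            // Skip rows whose TradeID is DBNull.
            if (tradeRow.IsNull(srcTradeId)) continue;

            DataRow rowToAdd = riskTable.NewRow();
            // Indexing by DataColumn avoids the per-row name lookup.
            rowToAdd[dstRowId] = 0;
            rowToAdd[dstTcn] = tradeRow[srcTradeId].ToString();
            rowToAdd[dstCcy] = tradeRow[srcCurrency];
            rowToAdd[dstUsdValue] = calculator(tradeRow);
            riskTable.Rows.Add(rowToAdd);
        }
    }
}
```

Because each row is written to `riskTable` as soon as it is read, nothing beyond the two tables is held in memory.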
The use of Distinct does mean that all the objects will be buffered in memory again; if you know that you need this, fine, but in many cases this may be redundant.