I have simple methods to export a DataTable to XLS by building a string. The number of columns is 5 to 30, and the number of rows ranges from 1 to 1000. Sometimes performance is a problem, and I would appreciate advice on what I could change in my code. I’m using .NET 4.0.
public string FormatCell(string columnName, object value)
{
StringBuilder builder = new StringBuilder();
string formattedValue = string.Empty;
string type = "String";
string style = "s21";
if (!(value is DBNull) && columnName.Contains("GIS"))
formattedValue = Convert.ToDouble(value).ToString("##.00000000°");
else if (value is DateTime)
{
style = "s22";
type = "DateTime";
DateTime date = (DateTime)value;
formattedValue = date.ToString("yyyy-MM-ddTHH:mm:ss.fff");
}
else if (value is double || value is float || value is decimal)
{
formattedValue = Convert.ToDecimal(value).ToString("#.00", System.Globalization.CultureInfo.InvariantCulture); // invariant culture always uses '.' as the decimal separator
type = "Number";
}
else if (value is int)
{
formattedValue = value.ToString();
type = "Number";
}
else
formattedValue = System.Security.SecurityElement.Escape(value.ToString()); // escape &, <, > and quotes so the XML stays well-formed
builder.AppendFormat("<Cell ss:StyleID=\"{0}\"><Data ss:Type=\"{1}\">", style, type);
builder.Append(formattedValue);
builder.AppendLine("</Data></Cell>");
return builder.ToString();
}
public string ConvertToXls(DataTable table)
{
StringBuilder builder = new StringBuilder();
int rows = table.Rows.Count + 1;
int cols = table.Columns.Count;
builder.AppendLine("<?xml version=\"1.0\" encoding=\"UTF-8\" ?>");
builder.AppendLine("<?mso-application progid=\"Excel.Sheet\"?>");
builder.AppendLine("<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\"");
builder.AppendLine(" xmlns:o=\"urn:schemas-microsoft-com:office:office\"");
builder.AppendLine(" xmlns:x=\"urn:schemas-microsoft-com:office:excel\"");
builder.AppendLine(" xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\"");
builder.AppendLine(" xmlns:html=\"http://www.w3.org/TR/REC-html40/\">");
builder.AppendLine(" <DocumentProperties xmlns=\"urn:schemas-microsoft-com:office:office\">");
builder.AppendLine(" <Author>Author</Author>");
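// "MM" is the month and "mm" the minutes; the trailing Z means UTC, so use UtcNow
builder.AppendLine(string.Format(" <Created>{0}</Created>", DateTime.UtcNow.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", System.Globalization.CultureInfo.InvariantCulture)));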
builder.AppendLine(" <Company>Company</Company>");
builder.AppendLine(" <Version>1.0</Version>");
builder.AppendLine(" </DocumentProperties>");
builder.AppendLine(" <ExcelWorkbook xmlns=\"urn:schemas-microsoft-com:office:excel\">");
builder.AppendLine(" <WindowHeight>8955</WindowHeight>");
builder.AppendLine(" <WindowWidth>11355</WindowWidth>");
builder.AppendLine(" <WindowTopX>480</WindowTopX>");
builder.AppendLine(" <WindowTopY>15</WindowTopY>");
builder.AppendLine(" <ProtectStructure>False</ProtectStructure>");
builder.AppendLine(" <ProtectWindows>False</ProtectWindows>");
builder.AppendLine(" </ExcelWorkbook>");
builder.AppendLine(" <Styles>");
builder.AppendLine(" <Style ss:ID=\"Default\" ss:Name=\"Normal\">");
builder.AppendLine(" <Alignment ss:Vertical=\"Bottom\"/>");
builder.AppendLine(" <Borders/>");
builder.AppendLine(" <Font/>");
builder.AppendLine(" <Interior/>");
builder.AppendLine(" <Protection/>");
builder.AppendLine(" </Style>");
builder.AppendLine(" <Style ss:ID=\"s21\">");
builder.AppendLine(" <Alignment ss:Vertical=\"Bottom\" ss:WrapText=\"1\"/>");
builder.AppendLine(" </Style>");
builder.AppendLine(" <Style ss:ID=\"s22\">");
builder.AppendLine(" <NumberFormat ss:Format=\"Short Date\"/>");
builder.AppendLine(" </Style>");
builder.AppendLine(" </Styles>");
builder.AppendLine(" <Worksheet ss:Name=\"Export\">");
builder.AppendLine(string.Format(" <Table ss:ExpandedColumnCount=\"{0}\" ss:ExpandedRowCount=\"{1}\" x:FullColumns=\"1\"", cols.ToString(), rows.ToString()));
builder.AppendLine(" x:FullRows=\"1\">");
//generate title
builder.AppendLine("<Row>");
foreach (DataColumn eachColumn in table.Columns) // you can write a half columns of table and put the remaining columns in sheet2
{
if (eachColumn.ColumnName != "ID")
{
builder.Append("<Cell ss:StyleID=\"s21\"><Data ss:Type=\"String\">");
builder.Append(System.Security.SecurityElement.Escape(eachColumn.ColumnName)); // column names may contain &, < or >
builder.AppendLine("</Data></Cell>");
}
}
builder.AppendLine("</Row>");
//generate data
foreach (DataRow eachRow in table.Rows)
{
builder.AppendLine("<Row>");
foreach (DataColumn eachColumn in table.Columns)
{
if (eachColumn.ColumnName != "ID")
{
builder.AppendLine(FormatCell(eachColumn.ColumnName, eachRow[eachColumn]));
}
}
builder.AppendLine("</Row>");
}
builder.AppendLine(" </Table>");
builder.AppendLine(" <WorksheetOptions xmlns=\"urn:schemas-microsoft-com:office:excel\">");
builder.AppendLine(" <Selected/>");
builder.AppendLine(" <Panes>");
builder.AppendLine(" <Pane>");
builder.AppendLine(" <Number>3</Number>");
builder.AppendLine(" <ActiveRow>1</ActiveRow>");
builder.AppendLine(" </Pane>");
builder.AppendLine(" </Panes>");
builder.AppendLine(" <ProtectObjects>False</ProtectObjects>");
builder.AppendLine(" <ProtectScenarios>False</ProtectScenarios>");
builder.AppendLine(" </WorksheetOptions>");
builder.AppendLine(" </Worksheet>");
builder.AppendLine(" <Worksheet ss:Name=\"Sheet2\">");
builder.AppendLine(" <WorksheetOptions xmlns=\"urn:schemas-microsoft-com:office:excel\">");
builder.AppendLine(" <ProtectObjects>False</ProtectObjects>");
builder.AppendLine(" <ProtectScenarios>False</ProtectScenarios>");
builder.AppendLine(" </WorksheetOptions>");
builder.AppendLine(" </Worksheet>");
builder.AppendLine(" <Worksheet ss:Name=\"Sheet3\">");
builder.AppendLine(" <WorksheetOptions xmlns=\"urn:schemas-microsoft-com:office:excel\">");
builder.AppendLine(" <ProtectObjects>False</ProtectObjects>");
builder.AppendLine(" <ProtectScenarios>False</ProtectScenarios>");
builder.AppendLine(" </WorksheetOptions>");
builder.AppendLine(" </Worksheet>");
builder.AppendLine("</Workbook>");
return builder.ToString();
}
I use it like this:
string xlsData = ConvertToXls(someTable);
System.CodeDom.Compiler.TempFileCollection fileCollection = new System.CodeDom.Compiler.TempFileCollection();
string tempFileName = fileCollection.AddExtension("xls", true);
if (File.Exists(tempFileName))
File.Delete(tempFileName);
using (StreamWriter writer = new StreamWriter(tempFileName, false, Encoding.UTF8))
writer.Write(xlsData);
The simplest thing you can do is construct the StringBuilder with an initial capacity larger than the default value.
The default capacity is only 16 characters, and the buffer has to be extended each time it fills up. This means it is going to be extended many times if you use the default.
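A minimal sketch of presizing the buffer from the table dimensions follows. The helper name CreateBuilder and all three constants are assumptions made for illustration (rough guesses, not measurements), so tune them against your real output.

```csharp
using System;
using System.Data;
using System.Text;

public static class BuilderSizing
{
    // Hypothetical helper: estimates the final size of the export so the
    // internal buffer rarely has to grow. All constants are rough guesses.
    public static StringBuilder CreateBuilder(DataTable table)
    {
        const int EstimatedCharsPerCell = 64;              // cell markup plus a short value
        const int FixedMarkupChars = 8192;                 // workbook header, styles, footer
        const int MaxInitialCapacity = 16 * 1024 * 1024;   // never preallocate more than this

        // +1 for the title row; long arithmetic avoids overflow on large tables.
        long cells = (long)(table.Rows.Count + 1) * table.Columns.Count;
        long estimate = FixedMarkupChars + cells * EstimatedCharsPerCell;

        int capacity = (int)Math.Min(estimate, MaxInitialCapacity);
        return new StringBuilder(capacity);
    }
}
```

In ConvertToXls this would replace new StringBuilder() with BuilderSizing.CreateBuilder(table).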
Unless your system is tight on memory, or the output is really, really huge, I doubt that streaming it directly, as was suggested before, would make much difference. I suspect it could actually make things marginally worse, since a file stream write is unlikely to have less overhead than appending to a StringBuilder object that has already been allocated (assuming it doesn’t need to be reallocated frequently).
The optimal solution might be to flush the StringBuilder output to a stream periodically, once it grows past some size chosen for the memory of your system, if the total could be more than, say, 10 or 20 megabytes. That way you would avoid memory issues while also avoiding the potential overhead of many small writes to the output stream.
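The periodic-flush idea could look roughly like this. ChunkedWriter and WriteInChunks are hypothetical names, and the threshold is whatever character count suits your machine; this is a sketch of the technique rather than a drop-in replacement for ConvertToXls.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ChunkedWriter
{
    // Hypothetical helper: buffers fragments in a StringBuilder and writes the
    // buffer out whenever it reaches flushThreshold characters.
    public static void WriteInChunks(IEnumerable<string> fragments, TextWriter writer, int flushThreshold)
    {
        StringBuilder buffer = new StringBuilder(flushThreshold);
        foreach (string fragment in fragments)
        {
            buffer.Append(fragment);
            if (buffer.Length >= flushThreshold)
            {
                writer.Write(buffer.ToString());
                buffer.Length = 0; // empty the buffer but keep its allocated capacity
            }
        }

        // Write whatever is left after the last fragment.
        if (buffer.Length > 0)
            writer.Write(buffer.ToString());
    }
}
```

With a StreamWriter over the temp file as the writer and one fragment per generated row, memory use stays near the threshold while the number of writes stays small.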
Update – Testing note:
I ran some tests creating very large strings (>50 megabytes), and allocating the memory in advance made little appreciable difference.
But more importantly, the amount of time required just to create such a string using the simplest possible form, a bare loop that does nothing but append, is almost inconsequential. I can fill up all of my desktop computer’s memory in a couple of seconds.
What this means is that the overhead of StringBuilder is not at all your problem. One can also deduce from this that switching to a stream write will definitely not help you either.
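A test along those lines can be sketched as follows. This is not the original test code; the fragment, the iteration count and the names AppendTiming, TimeAppends and Report are assumptions, and the timings will differ from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class AppendTiming
{
    // Appends the same short fragment `count` times and returns the elapsed
    // milliseconds; finalLength receives the resulting builder length.
    public static long TimeAppends(StringBuilder builder, int count, out int finalLength)
    {
        const string Fragment = "<Cell><Data>0123456789</Data></Cell>";
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
            builder.Append(Fragment);
        watch.Stop();
        finalLength = builder.Length;
        return watch.ElapsedMilliseconds;
    }

    // Compares a default-capacity builder with one presized to the final length.
    public static void Report()
    {
        const int Count = 1000000;
        int length;
        long defaultMs = TimeAppends(new StringBuilder(), Count, out length);
        long presizedMs = TimeAppends(new StringBuilder(length), Count, out length);
        Console.WriteLine("default: {0} ms, presized: {1} ms, {2} chars",
            defaultMs, presizedMs, length);
    }
}
```

Calling AppendTiming.Report() from a console application prints both timings so you can compare them on your own hardware.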
Instead, you need to look at some of the operations you are doing thousands or tens of thousands of times, such as the inner loop over the columns of every row. You could eliminate the check ColumnName != "ID" entirely by removing that column from your select.
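If changing the select is not an option, the name comparison can at least be done once per export instead of once per cell. The sketch below assumes a hypothetical helper, GetExportColumns, that is not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ColumnFilter
{
    // Hypothetical helper: decides once which columns are exported, so the
    // row loop never has to compare column names.
    public static DataColumn[] GetExportColumns(DataTable table)
    {
        List<DataColumn> columns = new List<DataColumn>(table.Columns.Count);
        foreach (DataColumn column in table.Columns)
        {
            if (column.ColumnName != "ID")
                columns.Add(column);
        }
        return columns.ToArray();
    }
}
```

ConvertToXls would call this once before the loops and then iterate over the returned array for both the title row and every data row. The array length is also the actual number of exported columns.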
Suggestion for improving FormatCell:
To build the index, I think the most efficient way would be to map the column numbers to an array that defines the type for each column, then in FormatCell just use your pre-built map of column numbers to data types.
Then pass FormatCell the column number, and it can look up the data type from the array and just check it with a switch.
I think this would cut down a lot of overhead.
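One way the map and the switch could be sketched is shown below. The CellKind enum and the ColumnKinds helper are hypothetical names. The sketch also assumes the declared DataColumn.DataType matches the runtime type of the values, which does not hold for columns typed as object.

```csharp
using System;
using System.Data;

public enum CellKind { Text, Gis, Date, Decimal, Integer }

public static class ColumnKinds
{
    // Builds one CellKind per column, indexed by column ordinal.
    // The branch order mirrors the checks in the original FormatCell.
    public static CellKind[] BuildMap(DataTable table)
    {
        CellKind[] map = new CellKind[table.Columns.Count];
        foreach (DataColumn column in table.Columns)
        {
            Type t = column.DataType;
            CellKind kind;
            if (column.ColumnName.Contains("GIS")) kind = CellKind.Gis;
            else if (t == typeof(DateTime)) kind = CellKind.Date;
            else if (t == typeof(double) || t == typeof(float) || t == typeof(decimal)) kind = CellKind.Decimal;
            else if (t == typeof(int)) kind = CellKind.Integer;
            else kind = CellKind.Text;
            map[column.Ordinal] = kind;
        }
        return map;
    }

    // Returns the ss:Type attribute value for a cell of the given kind.
    public static string XmlType(CellKind kind)
    {
        switch (kind)
        {
            case CellKind.Date: return "DateTime";
            case CellKind.Decimal:
            case CellKind.Integer: return "Number";
            default: return "String"; // Text and Gis are both written as strings
        }
    }
}
```

FormatCell would then take the column ordinal, look up map[ordinal], and switch on it for the formatting as well. DBNull values still need a per-cell check, because the map describes the column rather than the individual value.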