I have been experimenting with ways to read data from a SQL server as quickly as possible and I came across an interesting discovery. If I read the data into a List<object[]> instead of a List<string[]>, performance increases by more than double.
I suspect this is due to not having to call the ToString() method on the fields, but I always thought that using objects had a negative impact on performance.
Is there any reason to not use a list of object arrays instead of string arrays?
EDIT: One thought I just had was the storage size of this data. Will storing the data in object arrays take more room than as strings?
Here is my test code:
private void executeSqlObject()
{
List<object[]> list = new List<object[]>();
using (SqlConnection cnn = new SqlConnection(_cnnString))
{
cnn.Open();
SqlCommand cmd = new SqlCommand("select * from test_table", cnn);
SqlDataReader reader = cmd.ExecuteReader();
int fieldCount = reader.FieldCount;
while (reader.Read())
{
object[] row = new object[fieldCount];
for (int i = 0; i < fieldCount; i++)
{
row[i] = reader[i];
}
list.Add(row);
}
}
}
private void executeSqlString()
{
List<string[]> list = new List<string[]>();
using (SqlConnection cnn = new SqlConnection(_cnnString))
{
cnn.Open();
SqlCommand cmd = new SqlCommand("select * from test_table", cnn);
SqlDataReader reader = cmd.ExecuteReader();
int fieldCount = reader.FieldCount;
while (reader.Read())
{
string[] row = new string[fieldCount];
for (int i = 0; i < fieldCount; i++)
{
row[i] = reader[i].ToString();
}
list.Add(row);
}
}
}
private void runTests()
{
Stopwatch watch = new Stopwatch();
for (int i = 0; i < 10; i++)
{
watch.Start();
executeSqlObject();
Debug.WriteLine("Object Time: " + watch.ElapsedMilliseconds.ToString());
watch.Reset();
}
for (int i = 0; i < 10; i++)
{
watch.Start();
executeSqlString();
Debug.WriteLine("String Time: " + watch.ElapsedMilliseconds.ToString());
watch.Reset();
}
}
And the results:
Object Time: 879
Object Time: 812
Object Time: 825
Object Time: 882
Object Time: 880
Object Time: 905
Object Time: 815
Object Time: 799
Object Time: 823
Object Time: 817
Average: 844
String Time: 1819
String Time: 1790
String Time: 1787
String Time: 1856
String Time: 1795
String Time: 1731
String Time: 1792
String Time: 1799
String Time: 1762
String Time: 1869
Average: 1800
objectonly adds overhead if you are causing additional boxing. And even then, this impact is fairly minimal. In your case,reader[i]always returnsobject. You already have it asobject, no matter whether that is a reference to a string, or an int, etc. Of course calling.ToString()adds overhead; in most cases (int, DateTime, etc) this involves both formatting code and the allocation of one (or more) extra string. By changing tostringyou are changing the data (for the worse, IMO – for example, you can no longer do correct sorts on dates, for example) and adding overhead. The edge case here is if all the columns are already actually strings – in which case you just add a few virtual method calls (but no extra real work).For info, if you are after raw performance, I thoroughly recommend looking at the micro-ORMs such as dapper. They are heavily optimised, but avoid the weight of “full” ORMs. For example, in dapper:
will, I expect, perform very comparably while giving you strongly typed object data.