I’m reading a CSV file and the records are recorded as a string[]. I want to take each record and convert it into a custom object.
T GetMyObject<T>();
Currently I’m doing this through reflection, which is really slow. I’m testing with a 515 MB file with several million records. It takes under 10 seconds to parse. It takes under 20 seconds to create the custom objects using manual conversions with Convert.ToSomeType, but around 4 minutes to do the conversion to the objects through reflection.
What is a good way to handle this automatically?
It seems a lot of time is spent in the PropertyInfo.SetValue method. I tried caching each property’s setter MethodInfo and invoking that instead, but it was actually slower.
I have also tried converting the setter into a delegate, as the great Jon Skeet suggested in “Improving performance reflection, what alternatives should I consider”, but the problem is that I don’t know what the property type is ahead of time. I’m able to get the delegate:
var myObject = Activator.CreateInstance<T>();
foreach( var property in typeof( T ).GetProperties() )
{
    var d = Delegate.CreateDelegate(
        typeof( Action<,> ).MakeGenericType( typeof( T ), property.PropertyType ),
        property.GetSetMethod() );
}
The problem here is I can’t cast the delegate into a concrete type like Action<T, int>, because the property type of int isn’t known ahead of time.
The first thing I’d say is write some sample code manually that tells you what the absolute best case you can expect is – see if your current code is worth fixing.
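A minimal sketch of such a baseline measurement, assuming a hypothetical two-column record type named Sample (the type, its properties, and the helper names are illustrative and not from the question). It times a hand-written conversion against naive PropertyInfo.SetValue over the same row, which gives the two ends of the range any automatic approach will fall between:

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.Reflection;

// Hypothetical record type, used only for illustration.
public class Sample
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class Baseline
{
    // Hand-written conversion: the best case to compare against.
    public static Sample Manual(string[] row)
    {
        return new Sample
        {
            Id = Convert.ToInt32(row[0], CultureInfo.InvariantCulture),
            Name = row[1]
        };
    }

    // Naive reflection: looks up and sets every property for every row.
    public static Sample Reflected(string[] row)
    {
        var obj = new Sample();
        PropertyInfo[] props =
        {
            typeof(Sample).GetProperty("Id"),
            typeof(Sample).GetProperty("Name")
        };
        for (int i = 0; i < props.Length; i++)
        {
            object value = Convert.ChangeType(
                row[i], props[i].PropertyType, CultureInfo.InvariantCulture);
            props[i].SetValue(obj, value, null);
        }
        return obj;
    }

    // Runs one conversion strategy repeatedly and returns the elapsed time.
    public static TimeSpan Time(Func<string[], Sample> convert, int iterations)
    {
        var row = new[] { "42", "abc" };
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) convert(row);
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling Baseline.Time(Baseline.Manual, n) and Baseline.Time(Baseline.Reflected, n) with the same n shows how much there is to gain; the absolute numbers depend on the machine and runtime.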
If you are using PropertyInfo.SetValue etc., then absolutely you can make it quicker, even with just object. HyperDescriptor might be a good start (it is significantly faster than raw reflection, but without making the code any more complicated).
For optimal performance, dynamic IL methods are the way to go (precompiled once); in 2.0/3.0, maybe DynamicMethod, but in 3.5 I’d favor Expression (with Compile()). Let me know if you want more detail?
Implementation using Expression and CsvReader, that uses the column headers to provide the mapping (it invents some data along the same lines); it uses IEnumerable<T> as the return type to avoid having to buffer the data (since you seem to have quite a lot of it):
Second version (see comments) that uses TypeConverter rather than Parse:
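The code from the answer is not reproduced here. As a rough sketch of the technique it describes, and not the answer's actual implementation, the following builds one compiled Expression per header layout and converts each cell with a TypeConverter. It assumes the rows have already been parsed into string[] (so no CsvReader is involved), that the header names match writable property names exactly, and that T has a public parameterless constructor. RowMapper, CsvMapping, and Person are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical record type, used only for illustration.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public double Score { get; set; }
}

public static class RowMapper<T> where T : new()
{
    // Builds, once per header layout, a delegate that turns a string[] row into a new T.
    public static Func<string[], T> Create(string[] headers)
    {
        ParameterExpression row = Expression.Parameter(typeof(string[]), "row");
        var bindings = new List<MemberBinding>();
        MethodInfo convert = typeof(TypeConverter).GetMethod(
            "ConvertFromInvariantString", new[] { typeof(string) });

        for (int i = 0; i < headers.Length; i++)
        {
            PropertyInfo prop = typeof(T).GetProperty(headers[i]);
            if (prop == null || !prop.CanWrite) continue; // skip unmapped columns

            Expression cell = Expression.ArrayIndex(row, Expression.Constant(i));
            Expression value = cell;
            if (prop.PropertyType != typeof(string))
            {
                // (TProp) converter.ConvertFromInvariantString(row[i])
                TypeConverter converter = TypeDescriptor.GetConverter(prop.PropertyType);
                value = Expression.Convert(
                    Expression.Call(
                        Expression.Constant(converter, typeof(TypeConverter)), convert, cell),
                    prop.PropertyType);
            }
            bindings.Add(Expression.Bind(prop, value));
        }

        // row => new T { Prop0 = ..., Prop1 = ... }
        Expression body = Expression.MemberInit(Expression.New(typeof(T)), bindings);
        return Expression.Lambda<Func<string[], T>>(body, row).Compile();
    }
}

public static class CsvMapping
{
    // Streams objects one at a time so the whole file is never buffered.
    public static IEnumerable<T> Map<T>(string[] headers, IEnumerable<string[]> rows)
        where T : new()
    {
        Func<string[], T> map = RowMapper<T>.Create(headers);
        foreach (string[] r in rows) yield return map(r);
    }
}
```

The reflection and expression-building cost is paid once in Create; each row afterwards goes through an ordinary compiled delegate. Because the property type is baked into the expression tree, nothing has to be cast to a concrete Action<T, int> at compile time. The TypeConverter call still boxes value types, so a variant that calls a typed Parse method per column would likely be closer to the hand-written conversion.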