I have a C++ function that I call from C# through P/Invoke. The declaration looks like this:
static extern void ImageProcessing( [MarshalAs(UnmanagedType.LPArray)]ushort[] inImage, [MarshalAs(UnmanagedType.LPArray)]ushort[] outImage, int inYSize, int inXSize);
I’ve wrapped the function in timing code, both internal and external. Internally, the function runs in 0.24 s. Externally, it takes 2.8 s, or about 12 times longer. What’s going on? Is marshalling slowing me down that much? If it is, how can I get around that? Should I switch to unsafe code and use pointers or something? I’m flummoxed as to where the extra time is coming from.
The answer is, sadly, far more mundane than these suggestions, although they do help. Basically, I messed up with how I was doing timing.
The timing code that I was using was this:
This code is specific to the Intel compiler and is designed to give extremely precise time measurements. Unfortunately, that precision comes at a cost of roughly 2.5 seconds per run. Removing the timing code removed that overhead.
There still appears to be some extra runtime cost, though. The code reported 0.24 s with that timing code on, and it now reports roughly 0.35 s, which means there’s still a speed cost of close to 50%.
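For comparison, a portable way to time the native side is `std::chrono::steady_clock` from the C++ standard library, which is usually cheap to read relative to a workload of a few hundred milliseconds. The sketch below is not the original timing code. The body of `ImageProcessing` here is a hypothetical stand-in (it just halves each pixel), and `time_seconds` and `time_image_processing` are illustrative names.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the native routine; the real body is not shown
// in this post. It halves every pixel so there is some work to time.
void ImageProcessing(const std::uint16_t* inImage, std::uint16_t* outImage,
                     int inYSize, int inXSize) {
    const std::size_t n = static_cast<std::size_t>(inYSize) * inXSize;
    for (std::size_t i = 0; i < n; ++i) {
        outImage[i] = static_cast<std::uint16_t>(inImage[i] / 2);
    }
}

// Runs fn once and returns the elapsed wall-clock time in seconds.
// steady_clock is monotonic, so the result is never negative.
template <typename Fn>
double time_seconds(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Illustrative helper: times a single call on a synthetic image.
double time_image_processing(int ySize, int xSize) {
    std::vector<std::uint16_t> in(static_cast<std::size_t>(ySize) * xSize, 1000);
    std::vector<std::uint16_t> out(in.size());
    return time_seconds([&] {
        ImageProcessing(in.data(), out.data(), ySize, xSize);
    });
}
```

Because the timer only reads the clock twice, whatever it reports should be dominated by the work itself and not by the measurement.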
Changing the code to this:
and called like:
drops the execution time to 0.3 s (average of three runs). Still too slow for my tastes, but a roughly 10x speed improvement is certainly within the realm of acceptability for my boss.
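An "average of three runs" figure like the one above can be collected with a small harness. This is only a sketch of the idea, not the measurement code actually used: `mean_seconds` is a hypothetical helper, and the untimed warm-up call is an assumption, added so that one-time costs (such as loading the library on first use) do not skew the mean.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: calls `work` once untimed as a warm-up, then returns
// the mean wall-clock time in seconds over `runs` timed calls.
double mean_seconds(const std::function<void()>& work, int runs) {
    if (runs <= 0) {
        return 0.0;  // nothing to average
    }
    work();  // warm-up call, deliberately not timed (an assumption of this sketch)
    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return total / runs;
}
```

Calling `mean_seconds(work, 3)` invokes `work` four times in total: one warm-up plus three timed runs.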