Edit: I apologize everybody. I used the term ‘jagged array’ when I actually meant to say ‘multi-dimensional array’ (as can be seen in my example below). I apologize for using the incorrect name. I actually found jagged arrays to be faster than multi-dimensional ones! I have added my measurements for jagged arrays.
I was trying to use a jagged multi-dimensional array today, when I noticed that it’s performance is not as I would have expected. Using a single-dimensional array and manually calculating indices was much faster (almost two times) than using a 2D array. I wrote a test using 1024*1024 arrays (initialized to random values), for 1000 iterations, and I got the following results on my machine:
sum(double[], int): 2738 ms (100%) sum(double[,]): 5019 ms (183%) sum(double[][]): 2540 ms ( 93%)
This is my test code:
public static double sum(double[] d, int l1) { // assuming the array is rectangular double sum = 0; int l2 = d.Length / l1; for (int i = 0; i < l1; ++i) for (int j = 0; j < l2; ++j) sum += d[i * l2 + j]; return sum; } public static double sum(double[,] d) { double sum = 0; int l1 = d.GetLength(0); int l2 = d.GetLength(1); for (int i = 0; i < l1; ++i) for (int j = 0; j < l2; ++j) sum += d[i, j]; return sum; } public static double sum(double[][] d) { double sum = 0; for (int i = 0; i < d.Length; ++i) for (int j = 0; j < d[i].Length; ++j) sum += d[i][j]; return sum; } public static void Main() { Random random = new Random(); const int l1 = 1024, l2 = 1024; double[ ] d1 = new double[l1 * l2]; double[,] d2 = new double[l1 , l2]; double[][] d3 = new double[l1][]; for (int i = 0; i < l1; ++i) { d3[i] = new double[l2]; for (int j = 0; j < l2; ++j) d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble(); } // const int iterations = 1000; TestTime(sum, d1, l1, iterations); TestTime(sum, d2, iterations); TestTime(sum, d3, iterations); }
Further investigation showed that the IL for the second method is 23% larger than that of the first method. (Code size 68 vs 52.) This is mostly due to calls to System.Array::GetLength(int). The compiler also emits calls to Array::Get for the jagged multi-dimensional array, whereas it simply calls ldelem for the simple array.
So I am wondering, why is access through multi-dimensional arrays slower than normal arrays? I would have assumed the compiler (or JIT) would do something similar to what I did in my first method, but this was not actually the case.
Could you plese help me understand why this is happening the way it is?
Update: Following Henk Holterman’s suggestion, here is the implementation of TestTime:
public static void TestTime<T, TR>(Func<T, TR> action, T obj, int iterations) { Stopwatch stopwatch = Stopwatch.StartNew(); for (int i = 0; i < iterations; ++i) action(obj); Console.WriteLine(action.Method.Name + ' took ' + stopwatch.Elapsed); } public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1, T2 obj2, int iterations) { Stopwatch stopwatch = Stopwatch.StartNew(); for (int i = 0; i < iterations; ++i) action(obj1, obj2); Console.WriteLine(action.Method.Name + ' took ' + stopwatch.Elapsed); }
Single dimensional arrays with a lower bound of 0 are a different type to either multi-dimensional or non-0 lower bound arrays within IL (
vectorvsarrayIIRC).vectoris simpler to work with – to get to element x, you just dopointer + size * x. For anarray, you have to dopointer + size * (x-lower bound)for a single dimensional array, and yet more arithmetic for each dimension you add.Basically the CLR is optimised for the vastly more common case.