I’m new to C from many years of Matlab for numerical programming. I’ve developed a program to solve a large system of differential equations, but I’m pretty sure I’ve done something stupid as, after profiling the code, I was surprised to see three loops that were taking ~90% of the computation time, despite the fact they are performing the most trivial steps of the program.
My question is in three parts based on these expensive loops:
-
Initialization of an array to zero. When J is declared to be a double array are the values of the array initialized to zero? If not, is there a fast way to set all the elements to zero?
void spam(){ double J[151][151]; /* Other relevant variables declared */ calcJac(data,J,y); /* Use J */ } static void calcJac(UserData data, double J[151][151],N_Vector y) { /* The first expensive loop */ int iter, jter; for (iter=0; iter<151; iter++) { for (jter = 0; jter<151; jter++) { J[iter][jter] = 0; } } /* More code to populate J from data and y that runs very quickly */ } -
During the course of solving I need to solve matrix equations defined by P = I – gamma*J. The construction of P is taking longer than solving the system of equations it defines, so something I’m doing is likely in error. In the relatively slow loop below, is accessing a matrix that is contained in a structure ‘data’ the the slow component or is it something else about the loop?
for (iter = 1; iter<151; iter++) { for(jter = 1; jter<151; jter++){ P[iter-1][jter-1] = - gamma*(data->J[iter][jter]); } } -
Is there a best practice for matrix multiplication? In the loop below, Ith(v,iter) is a macro for getting the iter-th component of a vector held in the N_Vector structure ‘v’ (a data type used by the Sundials solvers). Particularly, is there a best way to get the dot product between v and the rows of J?
Jv_scratch = 0; int iter, jter; for (iter=1; iter<151; iter++) { for (jter=1; jter<151; jter++) { Jv_scratch += J[iter][jter]*Ith(v,jter); } Ith(Jv,iter) = Jv_scratch; Jv_scratch = 0; }
1) No they’re not you can memset the array as follows:
or you can use an array initialiser:
2) Well you are using a fairly complex calculation to calculate the position of P and the position of J.
You may well get better performance. by stepping through as pointers:
This way you move various of the array index calculation outside of the loop.
3) The best practice is to try and move as many calculations out of the loop as possible. Much like I did on the loop above.