I call matrix_vector_mult() very frequently. It multiplies a matrix by a vector, and its implementation is below.
Question: Is there a simple way to make it significantly faster, by a factor of two or more?
Remarks: 1) The size of the matrix is about 300×50. It doesn't change during the
run. 2) It must work on both Windows and Linux.
double vectors_dot_prod(const double *x, const double *y, int n)
{
    double res = 0.0;
    int i;
    for (i = 0; i < n; i++)
    {
        res += x[i] * y[i];
    }
    return res;
}

void matrix_vector_mult(const double **mat, const double *vec, double *result, int rows, int cols)
{ // in matrix form: result = mat * vec;
    int i;
    for (i = 0; i < rows; i++)
    {
        result[i] = vectors_dot_prod(mat[i], vec, cols);
    }
}
In theory this is something a good compiler should do by itself. However, I tried it on my system (g++ 4.6.3) and got about twice the speed on a 300×50 matrix by hand-unrolling 4 multiplications in the dot-product loop (about 18 µs per matrix instead of 34 µs per matrix).
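The exact listing behind those timings is not reproduced here. As a rough illustration only, a 4-way hand-unrolled dot product might look like the sketch below. The names vectors_dot_prod_unrolled and matrix_vector_mult_unrolled are hypothetical, chosen to sit next to the functions from the question, and the remainder loop is an assumption needed because 50 columns is not a multiple of 4.

```c
/* Sketch: dot product with the loop body unrolled four times.
   Hypothetical name; not necessarily the code that was timed. */
double vectors_dot_prod_unrolled(const double *x, const double *y, int n)
{
    double res = 0.0;
    int i = 0;

    /* Main loop: four multiplications per iteration. */
    for (; i + 3 < n; i += 4)
    {
        res += x[i]     * y[i]
             + x[i + 1] * y[i + 1]
             + x[i + 2] * y[i + 2]
             + x[i + 3] * y[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
    {
        res += x[i] * y[i];
    }
    return res;
}

/* Same interface as matrix_vector_mult(), using the unrolled dot product. */
void matrix_vector_mult_unrolled(const double **mat, const double *vec,
                                 double *result, int rows, int cols)
{
    int i;
    for (i = 0; i < rows; i++)
    {
        result[i] = vectors_dot_prod_unrolled(mat[i], vec, cols);
    }
}
```

Because the additions are grouped differently from the simple loop, the result can differ from the original in the last few bits of floating-point rounding. Whether this is actually faster depends on the compiler and its optimization flags, so it is worth timing both versions on the target machines.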
I expect, however, that the results of this level of micro-optimization will vary widely between systems.