I’m trying to use cblas_sgemm to do fast matrix multiplication on two matrices of ints.
Right now it’s returning all zeroes.
I ran a quick naive matrix multiply to double-check the expected output, and it is not supposed to be all zeroes.
The working naive approach:
typedef int mm_data_t;
void func1( mm_data_t *in1, mm_data_t *in2, mm_data_t *out, int N ){
    int i, j, k;
    for(i=0; i<N; i++){
        for(k=0; k<N; k++){
            int temp = in1[i*N+k];
            for(j=0; j<N; j++){
                out[i*N+j] += temp * in2[k*N+j];
            }
        }
    }
}
And using cblas_sgemm:
void func2( mm_data_t *in1, mm_data_t *in2, mm_data_t *out, int N ){
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N, N, 1.0, (float*)in1, N, (float*)in2, N, 0.0, (float*)out, N);
}
I’m using one-dimensional arrays for optimization.
The input data is black-boxed but constant.
cblas_sgemm() is designed to multiply matrices of single-precision floating point values, not integers. Casting the pointers to float* does not convert anything, so the bit patterns of your integers are being interpreted as floating point values. Any positive integer below 2^23 has an all-zero exponent field when read as a 32-bit float, so it is treated as a subnormal number. Multiplying any pair of these underflows to zero. So if your inputs are all small non-negative integers, the outputs will be all zeros.
And if your inputs contain small negative integers, whose high bits are all ones, they will be read as NaNs, so your outputs will probably contain a lot of NaNs, which will look like very large integers (which may be positive or negative).
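To see the reinterpretation concretely, here is a minimal sketch, assuming a 32-bit IEEE-754 float. The helper names bits_as_float and bits_as_int are made up for illustration; they copy the bits with memcpy, which is roughly what the (float*) cast in func2 makes sgemm do.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the 32 bits of an int as a float,
   mirroring what the (float*) cast makes sgemm see. */
static float bits_as_float(int32_t v) {
    float f;
    assert(sizeof f == sizeof v);   /* assumes a 32-bit float */
    memcpy(&f, &v, sizeof f);       /* copies bits, converts nothing */
    return f;
}

/* Hypothetical helper: read a float result back as an int,
   which is how the caller ends up viewing sgemm's output. */
static int32_t bits_as_int(float f) {
    int32_t v;
    memcpy(&v, &f, sizeof v);
    return v;
}
```

With these, fpclassify(bits_as_float(3)) reports FP_SUBNORMAL, the float product of bits_as_float(3) and bits_as_float(5) is 0.0f, isnan(bits_as_float(-1)) is true, and bits_as_int(0.0f) is 0, which is where the all-zero output comes from.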
If you really need to multiply integers, you will need to convert them to and from floating point, or use a library that can multiply matrices of integers (BLAS cannot).
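The conversion route could look something like the sketch below. The name func2_converted, the USE_CBLAS macro, and the sgemm_fallback loop are all assumptions made for illustration: define USE_CBLAS and link your BLAS to get the real cblas_sgemm call, otherwise the plain loop stands in so the snippet compiles on its own. Two caveats: a float only represents integers exactly up to 2^24 in magnitude, so larger inputs or sums may come back rounded, and because beta is 0.0 this overwrites out rather than accumulating into it the way func1 does.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef USE_CBLAS            /* hypothetical macro: define it when linking a BLAS */
#include <cblas.h>
#endif

typedef int mm_data_t;

/* Stand-in float multiply (C = A * B, row-major) used when USE_CBLAS is off. */
static void sgemm_fallback(const float *a, const float *b, float *c, int N) {
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) c[i*N + j] = 0.0f;
        for (int k = 0; k < N; k++) {
            float t = a[i*N + k];
            for (int j = 0; j < N; j++) c[i*N + j] += t * b[k*N + j];
        }
    }
}

/* Convert ints to floats, multiply in float, convert back.
   Returns 0 on success, -1 if allocation fails. */
int func2_converted(const mm_data_t *in1, const mm_data_t *in2,
                    mm_data_t *out, int N) {
    assert(N > 0);
    size_t count = (size_t)N * (size_t)N;
    float *a = malloc(count * sizeof *a);
    float *b = malloc(count * sizeof *b);
    float *c = malloc(count * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1;
    }

    /* Value conversion, not a pointer cast: 3 becomes 3.0f. */
    for (size_t i = 0; i < count; i++) {
        a[i] = (float)in1[i];
        b[i] = (float)in2[i];
    }

#ifdef USE_CBLAS
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                N, N, N, 1.0f, a, N, b, N, 0.0f, c, N);
#else
    sgemm_fallback(a, b, c, N);
#endif

    /* Round to nearest before converting back to int. */
    for (size_t i = 0; i < count; i++) {
        float r = c[i] < 0.0f ? c[i] - 0.5f : c[i] + 0.5f;
        out[i] = (mm_data_t)r;
    }

    free(a); free(b); free(c);
    return 0;
}
```

The extra copies cost O(N^2) time and memory, which is usually small next to the O(N^3) multiply, but it is worth measuring against func1 for your matrix sizes.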