I am creating a floating point matrix template class. The class declaration is shown below, with only the relevant functions and members.
// columns, rows
template <unsigned int c, unsigned int r>
class Matrix {
public:
    Matrix(float value);
    float& At(unsigned int x, unsigned int y);
    float const& At(unsigned int x, unsigned int y) const;
    template <unsigned int p> Matrix<p, r> MultipliedBy(Matrix<p, c> const& other);
private:
    // column-major ordering
    float data_[c][r];
};
The implementations for each of the above functions follow.
template <unsigned int c, unsigned int r>
Matrix<c, r>::Matrix(float value) {
    std::fill(&data_[0][0], &data_[c][r], value);
}

template <unsigned int c, unsigned int r>
float& Matrix<c, r>::At(unsigned int x, unsigned int y) {
    if (x >= c || y >= r) {
        return data_[0][0];
    }
    return data_[x][y];
}

template <unsigned int c, unsigned int r>
float const& Matrix<c, r>::At(unsigned int x, unsigned int y) const {
    if (x >= c || y >= r) {
        return data_[0][0];
    }
    return data_[x][y];
}

template <unsigned int c, unsigned int r>
template <unsigned int p>
Matrix<p, r> Matrix<c, r>::MultipliedBy(Matrix<p, c> const& other) {
    Matrix<p, r> result(0.0f);
    for (unsigned int x = 0; x < c; x++) {
        for (unsigned int y = 0; y < r; y++) {
            for (unsigned int z = 0; z < p; z++) {
                result.At(z, y) += At(x, y) * other.At(z, x);
            }
        }
    }
    return result;
}
Now, a few lines of test code.
Matrix<4, 4> m1(0.0f);
// m1 set to
//
// 1 2 3 4
// 5 6 7 8
// 9 10 11 12
// 13 14 15 16
Matrix<1, 4> m2(0.0f);
// m2 set to
//
// 6
// 3
// 8
// 9
Matrix<1, 4> m3 = m1.MultipliedBy(m2);
Here’s where things get weird. When compiled (using g++) with no optimization (-O0):
// m3 contains
// 0
// 0
// 0
// 0
With any optimization (-O1, -O2, or -O3):
// m3 contains
// 210
// 236
// 262
// 288
Note that even with the optimization, the answer is incorrect (verified with an external calculator). So I narrowed it down to this call in MultipliedBy:
Matrix<p, r> result(0.0f);
If I instantiate result in any way, other becomes invalidated (all of its data_ values are set to 0.0f). Before result is allocated and initialized, other is still valid (6, 3, 8, 9).
It is worth noting that if I multiply two matrices of the same (square) dimension, I get a completely valid and correct output, regardless of the optimization level.
Anyone have a clue what in the world g++ is pulling here? I’m running g++ (GCC) 4.6.1 on mingw… might this have something to do with the problem?
&data_[c][r] is perhaps wrong: it's data_ + (c*r + r) * FS, whereas you perhaps need &data_[c-1][r-1] + FS, which is data_ + ((c-1)*r + (r-1) + 1) * FS, which is data_ + c*r * FS. (Here FS == sizeof(float).)

Your last item is data_[c-1][r-1], so one past the last would be data_[c-1][r], not data_[c][r].
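
To make the arithmetic concrete: for Matrix<1, 4>, &data_[c][r] is &data_[1][4], which lies 1*4 + 4 = 8 floats from the start of an array that holds only 4. The original std::fill therefore writes r extra floats past the end of data_, which would be consistent with a neighbouring object such as other being zeroed. The following is a minimal sketch of the class with only the constructor changed, assuming the rest of the code stays as posted. CheckExampleProduct is a hypothetical helper added here for illustration, and it assumes m1 is filled so that the rows read 1 2 3 4, 5 6 7 8, and so on.

```cpp
#include <algorithm>
#include <cassert>

// columns, rows
template <unsigned int c, unsigned int r>
class Matrix {
public:
    Matrix(float value) {
        // data_ stores c * r floats contiguously, so the end of the range
        // is c * r elements past the first one (not &data_[c][r]).
        float* first = &data_[0][0];
        std::fill(first, first + c * r, value);
    }

    float& At(unsigned int x, unsigned int y) {
        if (x >= c || y >= r) {
            return data_[0][0];  // same out-of-range fallback as the question
        }
        return data_[x][y];
    }

    float const& At(unsigned int x, unsigned int y) const {
        if (x >= c || y >= r) {
            return data_[0][0];
        }
        return data_[x][y];
    }

    template <unsigned int p>
    Matrix<p, r> MultipliedBy(Matrix<p, c> const& other) {
        Matrix<p, r> result(0.0f);
        for (unsigned int x = 0; x < c; x++) {
            for (unsigned int y = 0; y < r; y++) {
                for (unsigned int z = 0; z < p; z++) {
                    result.At(z, y) += At(x, y) * other.At(z, x);
                }
            }
        }
        return result;
    }

private:
    // column-major ordering: data_[column][row]
    float data_[c][r];
};

// Hypothetical helper (not part of the original post): rebuilds the
// question's test case and checks m1 * m2 against a hand-computed product.
inline void CheckExampleProduct() {
    Matrix<4, 4> m1(0.0f);
    for (unsigned int y = 0; y < 4; y++) {
        for (unsigned int x = 0; x < 4; x++) {
            // Row y, column x holds 1..16 in reading order.
            m1.At(x, y) = static_cast<float>(4 * y + x + 1);
        }
    }

    Matrix<1, 4> m2(0.0f);
    const float column[4] = {6.0f, 3.0f, 8.0f, 9.0f};
    for (unsigned int y = 0; y < 4; y++) {
        m2.At(0, y) = column[y];
    }

    Matrix<1, 4> m3 = m1.MultipliedBy(m2);
    assert(m3.At(0, 0) == 72.0f);   // 1*6 + 2*3 + 3*8 + 4*9
    assert(m3.At(0, 1) == 176.0f);  // 5*6 + 6*3 + 7*8 + 8*9
    assert(m3.At(0, 2) == 280.0f);  // 9*6 + 10*3 + 11*8 + 12*9
    assert(m3.At(0, 3) == 384.0f);  // 13*6 + 14*3 + 15*8 + 16*9
}
```

Calling CheckExampleProduct() from main should pass at any optimization level, since the constructor no longer writes outside data_. As a side note, the 210, 236, 262, 288 reported in the question equal the product with m1 transposed, which would be consistent with m1 having been filled in the opposite index order, though the post does not show how m1 was set.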