Good afternoon. I am faced with a PCA task which simply involves reducing the


Asked: June 15, 2026 (2026-06-15T19:01:12+00:00)


Good afternoon.

I am faced with a PCA task which simply involves reducing the dimensionality of a vector. I’m not interested in a two-dimensional matrix in this case, but merely a D-dimensional vector which I would like to project along its K principal eigenvectors.

In order to implement PCA, I need to retrieve the covariance matrix of this vector. Let’s try to do this on an example vector:

someVec = np.array([[1.0, 1.0, 2.0, -1.0]])

I’ve defined this vector as a 1 x 4 matrix, i.e. a row vector, to make it compatible with numpy.cov. Passing it to numpy.cov yields a scalar rather than a covariance matrix, because numpy.cov assumes by default that each row is a variable (feature) and each column an observation:

print(np.cov(someVec))
1.58333333333

but this is (or rather, should be) merely a difference in dimensionality assumptions, and taking the covariance of the transpose vector should work fine, right? Except that it doesn’t:

print(np.cov(someVec.T))
/usr/lib/python2.7/site-packages/numpy/lib/function_base.py:2005: RuntimeWarning: invalid value encountered in divide
  return (dot(X, X.T.conj()) / fact).squeeze()
[[ nan  nan  nan  nan]
 [ nan  nan  nan  nan]
 [ nan  nan  nan  nan]
 [ nan  nan  nan  nan]]

I’m not exactly sure what I’ve done wrong here. Any advice?

Thanks,

Jason


1 Answer

Editorial Team · Answer 1 · 2026-06-15T19:01:13+00:00

If you want to pass in the transpose, you’ll need to set rowvar to zero.

In [10]: np.cov(someVec, rowvar=0)
Out[10]: array(1.5833333333333333)

In [11]: np.cov(someVec.T, rowvar=0)
Out[11]: array(1.5833333333333333)

From the docs:

rowvar : int, optional

If rowvar is non-zero (default), then each row
represents a variable, with observations in the columns. Otherwise,
the relationship is transposed: each column represents a variable,
while the rows contain observations.
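To make the effect of rowvar concrete, here is a minimal sketch. It reuses someVec from the question; the small `data` array (3 observations of 2 variables) is a made-up example, not something from the original post. It assumes a reasonably recent NumPy, where rowvar is usually passed as a boolean.

```python
import numpy as np

# The 1 x 4 row vector from the question: one variable, four observations.
someVec = np.array([[1.0, 1.0, 2.0, -1.0]])

# Default (rowvar=True): the single row is one variable, so the result is
# its scalar sample variance.
var_row = np.cov(someVec)

# Transposed input with rowvar=False: the single column is the variable,
# so the result is the same scalar.
var_col = np.cov(someVec.T, rowvar=False)

# Hypothetical data set: 3 observations (rows) of 2 variables (columns).
data = np.array([[1.0, 2.0],
                 [2.0, 1.0],
                 [4.0, 6.0]])

cov_cols = np.cov(data, rowvar=False)  # columns treated as variables
cov_rows = np.cov(data.T)              # same data, rows treated as variables

print(var_row, var_col)
print(cov_cols)
```

Both calls on `data` describe the same two variables, so they should produce the same 2 x 2 covariance matrix; only the layout convention differs.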

If you want a full covariance matrix, you’ll need more than one observation. With a single observation and numpy’s default estimator, which divides by (N-1) = 0, NaN is exactly what you’d expect. If you would like normalization by N instead of (N-1), pass bias=1:

In [12]: np.cov(someVec.T, bias=1)
Out[12]:
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

Again, from the docs:

bias : int, optional

Default normalization is by (N – 1), where N is
the number of observations given (unbiased estimate). If bias is 1,
then normalization is by N. These values can be overridden by using
the keyword ddof in numpy versions >= 1.5.
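To tie this back to the PCA goal in the question, the following is a rough sketch of what the computation might look like once several observations are available. The random `samples` array, the choice `k = 2`, and all variable names are assumptions made purely for illustration; this is one common way to do it, not necessarily what the asker needs.

```python
import numpy as np

# Hypothetical data: 5 observations (rows) of a 4-dimensional vector.
rng = np.random.default_rng(0)
samples = rng.normal(size=(5, 4))
n = samples.shape[0]

# Columns are the variables, so pass rowvar=False.
cov_unbiased = np.cov(samples, rowvar=False)           # divides by N - 1
cov_biased = np.cov(samples, rowvar=False, bias=True)  # divides by N
cov_ddof0 = np.cov(samples, rowvar=False, ddof=0)      # equivalent to bias=True

# Eigendecomposition of the symmetric covariance matrix.
# np.linalg.eigh returns eigenvalues in ascending order, so sort descending.
eigvals, eigvecs = np.linalg.eigh(cov_unbiased)
order = np.argsort(eigvals)[::-1]

k = 2                                # assumed number of components to keep
top_k = eigvecs[:, order[:k]]        # shape (4, k)

# Center the data, then project each observation onto the top-k eigenvectors.
centered = samples - samples.mean(axis=0)
projected = centered @ top_k         # shape (5, k)

print(projected.shape)
```

With only one observation, the centered data would be all zeros and there would be nothing to decompose, which is why the single-vector case in the question cannot yield meaningful principal directions.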
