I am using princomp in R to perform PCA. My data matrix is huge (10K x 10K, with each value given to up to 4 decimal places). It takes ~3.5 hours and ~6.5 GB of physical memory on a 2.27 GHz Xeon processor.
Since I only want the first two components, is there a faster way to do this?
Update:
In addition to speed, is there a memory-efficient way to do this?
It takes ~2 hours and ~6.3 GB of physical memory to calculate the first two components using svd(,2,).
You sometimes get access to so-called 'economical' decompositions, which allow you to cap the number of eigenvalues / eigenvectors computed. It looks like eigen() and prcomp() do not offer this, but svd() allows you to specify the maximum number to compute.
On small matrices, the gains seem modest, but the factor of three relative to princomp() may make it worth your while to reconstruct princomp() from svd(), as svd() allows you to stop after two values.
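As a rough sketch of what reconstructing the first two components from svd() might look like: the small random matrix X below is a hypothetical stand-in for the real 10K x 10K data, and the names Xc, k, scores, loadings and sdev are chosen here purely for illustration. It assumes PCA on the covariance matrix (columns centred but not rescaled).

```r
# Hypothetical small example; X stands in for the real data matrix.
set.seed(42)
X <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# Centre the columns, as princomp() and prcomp() do by default.
Xc <- scale(X, center = TRUE, scale = FALSE)

# Ask svd() for only the first k left and right singular vectors.
# Note that s$d still holds all singular values.
k <- 2
s <- svd(Xc, nu = k, nv = k)

# Scores on the first k components: U_k %*% diag(d_k),
# which is the same as Xc %*% V_k.
scores <- s$u %*% diag(s$d[1:k])

# Loadings (principal axes) are the right singular vectors.
loadings <- s$v

# Component standard deviations with the n - 1 divisor used by prcomp();
# princomp() divides by n instead.
sdev <- s$d[1:k] / sqrt(nrow(Xc) - 1)
```

The signs of individual components are arbitrary, so scores and loadings may come out flipped relative to princomp(). The standard deviations will also differ slightly from princomp() because of the n versus n - 1 divisor; multiply sdev by sqrt((n - 1) / n) if you need to match it exactly.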