I am trying to do PCA on a data frame with 5000 columns and 30 rows:
Sample <- read.table(file.choose(), header=F,sep="\t")
Sample.scaled <- data.frame(apply(Sample,2,scale))
pca.Sample <- prcomp(Sample.scaled,retx=TRUE)
I got the error:
Error in svd(x, nu = 0) : infinite or missing values in 'x'
sum(is.na(Sample))
[1] 0
sum(is.na(Sample.scaled))
[1] 90
I tried to drop all NA values by using the following:
pca.Sample <- prcomp(na.omit(Sample.scaled),retx=TRUE)
Which gives the following error
Error in svd(x, nu = 0) : 0 extent dimensions
I read reports that na.action only takes effect when a formula is given, so I tried the following:
pca.Sample <- prcomp(~.,center=TRUE,scale=TRUE,Sample, na.action=na.omit)
Now getting the following error
Error in prcomp.default(x, ...) :
cannot rescale a constant/zero column to unit variance
I think the problem might be this: “One of my data columns is constant. The variance of a constant is 0, and scaling would then divide by 0, which is impossible.”
But I am not sure how to tackle this. Any help would be much appreciated.
Judging by the fact that sum(is.na(Sample.scaled)) comes out as 90, when sum(is.na(Sample)) was 0, it looks like you’ve got three constant columns (3 columns × 30 rows = 90 NA values).
A randomly generated (reproducible) example with three constant columns gives the same error messages.
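A minimal sketch of such an example follows. The seed, the 20-column width (much smaller than the 5000 columns in the question), the constant value 7, and the choice of columns 3, 8 and 15 are all arbitrary assumptions made for illustration. Each failing call is wrapped in try() so the script runs to the end; the exact wording of the error messages may differ between R versions.

```r
set.seed(1)                                   # assumed seed, only for reproducibility

# 30 rows x 20 columns of random data (a small stand-in for the real file)
Sample <- data.frame(matrix(rnorm(30 * 20), nrow = 30))
Sample[, c(3, 8, 15)] <- 7                    # make three columns constant

# Scaling a constant column computes 0 / 0, which gives NaN
Sample.scaled <- data.frame(apply(Sample, 2, scale))
sum(is.na(Sample))                            # 0
sum(is.na(Sample.scaled))                     # 90 = 3 constant columns x 30 rows

# 1. NaN values reach svd(), which refuses them
r1 <- try(prcomp(Sample.scaled, retx = TRUE), silent = TRUE)

# 2. Every row contains an NaN, so na.omit() drops all 30 rows
nrow(na.omit(Sample.scaled))                  # 0
r2 <- try(prcomp(na.omit(Sample.scaled), retx = TRUE), silent = TRUE)

# 3. The unscaled data has no NA, so na.action has nothing to remove,
#    and prcomp() itself cannot scale the constant columns
r3 <- try(prcomp(~ ., data = Sample, center = TRUE, scale. = TRUE,
                 na.action = na.omit), silent = TRUE)

cat(r1, r2, r3)                               # print the three error messages
```

The second case shows why na.omit did not help in the question: it removes rows, and with constant columns every row is affected.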
You could try using na.omit on the transpose, which gets rid of the NA columns rather than rows.
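The transpose approach can be sketched as below. This is an illustration under the same assumptions as before (made-up 30 x 20 data with three constant columns), and the names Sample.scaled.2, pca.Sample.2, Sample.nonconst and pca.alt are hypothetical. The last two lines show an alternative that is not part of the suggestion above: drop the zero-variance columns first and let prcomp do the scaling.

```r
set.seed(1)                                   # assumed seed, only for reproducibility

# Made-up data: 30 rows x 20 columns, three of them constant
Sample <- data.frame(matrix(rnorm(30 * 20), nrow = 30))
Sample[, c(3, 8, 15)] <- 7
Sample.scaled <- data.frame(apply(Sample, 2, scale))

# na.omit() removes rows, so transpose to turn columns into rows,
# drop the all-NaN ones, then transpose back
Sample.scaled.2 <- data.frame(t(na.omit(t(Sample.scaled))))
dim(Sample.scaled.2)                          # 30 17

pca.Sample.2 <- prcomp(Sample.scaled.2, retx = TRUE)
summary(pca.Sample.2)

# Alternative (an assumption, not from the text above): filter out
# constant columns before scaling, then scale inside prcomp()
Sample.nonconst <- Sample[, apply(Sample, 2, sd) > 0]
pca.alt <- prcomp(Sample.nonconst, center = TRUE, scale. = TRUE)
```

Either way the constant columns are excluded from the PCA, which loses nothing because a column with zero variance carries no information for the components.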