I’m new to R, but I’m trying to estimate a missing value in a large microarray dataset using impute.knn() from library(impute) with 6 nearest neighbors.
Here’s an example:
seq1 <- seq(1:12)
mat1 <- matrix(seq1, 3)
mat1[2,2] <- "NA"
impute.knn(mat1, k=6)
I get the following error:
Error in knnimp.internal(x, k, imiss, irmiss, p, n, maxp = maxp) :
NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In storage.mode(x) <- "double" : NAs introduced by coercion
I’ve also tried the following:
impute.knn(mat1[2,2], k=6)
and I get the following error:
Error in rep(1, p) : invalid 'times' argument
My google-fu has been off today. Any suggestions as to why I might be getting this error?
edit: I’ve tried
mat1[2,2] <- NA
as James suggested, but I get a segmentation fault. Using
replace(mat1, mat1[2,2], NA)
does not help either. Any other suggestions?
I’m not sure why impute.knn is set up the way it is, but the example within ?impute.knn uses khanmiss, which is a data.frame of factors, which when coerced to matrix will be character.
You are getting a segmentation fault because you are trying to impute with K > ncol(mat1) nearest neighbours. It might be worth reporting a bug to the package authors, as this could easily be checked in R and return an error, not a C-level error which kills R.
Note:
Despite the strange example, impute.knn will work on mat1 when it is integer or double as well.
Take-home message:
Don’t try to impute using more information than you have.
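To make the type issue concrete, here is a small base-R sketch (it does not call impute.knn at all) showing what the three assignments tried in the question do to the matrix. The names bad, good and copy are only for illustration.

```r
# Rebuild the 3 x 4 example matrix from the question.
mat1 <- matrix(1:12, nrow = 3)

# Assigning the string "NA" coerces every element to character.
bad <- mat1
bad[2, 2] <- "NA"
typeof(bad)          # "character"
is.na(bad[2, 2])     # FALSE: it is the text "NA", not a missing value

# Assigning the constant NA keeps the matrix numeric.
good <- mat1
good[2, 2] <- NA
typeof(good)         # "integer"
is.na(good[2, 2])    # TRUE

# replace() returns a modified copy, so the result has to be assigned.
# A one-row index matrix addresses the cell at row 2, column 2.
copy <- replace(mat1, cbind(2, 2), NA)
is.na(copy[2, 2])    # TRUE
anyNA(mat1)          # FALSE: mat1 itself is unchanged
```

So the quoted "NA" explains the coercion warning in the first error, and a bare replace() call with no assignment leaves mat1 as it was.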
Perhaps the package authors should take heed of
and build in some error checking so a simple error does not cause a segfault.
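As a rough idea of what such checking could look like on the user side, here is a minimal sketch. check_knn_input is a hypothetical helper, not part of the impute package, and the bound on k is my own conservative assumption: the diagnosis above compares k with ncol(mat1), while the impute.knn documentation describes genes in rows, so the sketch simply keeps k below the smaller of the two dimensions.

```r
# Hypothetical helper (not part of the impute package): validate the input
# in R so that problems surface as ordinary R errors.
check_knn_input <- function(x, k) {
  if (!is.matrix(x) || !is.numeric(x)) {
    stop("x must be a numeric matrix; use NA, not the string 'NA', for missing values")
  }
  # Assumption: a conservative upper bound on k, one less than the
  # smaller matrix dimension.
  max_k <- min(dim(x)) - 1
  if (k > max_k) {
    stop(sprintf("k = %d is too large for a %d x %d matrix; use k <= %d",
                 k, nrow(x), ncol(x), max_k))
  }
  invisible(TRUE)
}

mat1 <- matrix(as.numeric(1:12), nrow = 3)
mat1[2, 2] <- NA

# k = 6 is rejected with a readable message.
result <- tryCatch(check_knn_input(mat1, k = 6),
                   error = function(e) conditionMessage(e))
print(result)

# k = 2 passes the check.
check_knn_input(mat1, k = 2)

# With the impute package installed, the call would then be:
# imputed <- impute::impute.knn(mat1, k = 2)$data
```

On a real microarray matrix with thousands of rows the bound is rarely hit; it matters mostly for toy examples like the one in the question.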