I have 200 samples, each of them has 60 features. I use PCA to find the principal components. I use neural network and also try k nearest neighbor However, the classification results are not good. I don’t mind to take out some samples, but how I can tell which samples destroy my classification results? I know I can try them one by one, but it would be very ineffective. Please help
Share
Instead of throwing out some samples you need to throw out some attributes.
PCA computes a matrix with d x d entries. At 60 attributes, this matrix has 3600 entries. You have only 200 samples to compute the contents of this matrix – no wonder that the result is pretty much random. You need fewer variables and more data.