My problem is the following:
I need to reduce a matrix by cutting some columns away while keeping the column names.
DTM is my original matrix, which looks like this:
> DTM
    word1 word2 word3 word4
[1]     1     1     0     0
[2]     2     0     1     1
[3]     0     1     0     2
and I want to obtain a new matrix (DTMr in the following chunk of code) that keeps the column labels and drops every column whose sum does not exceed a threshold (say 2):
    word1 word4
[1]     1     0
[2]     2     1
[3]     0     2
> DTMr <- matrix(, nrow = nrow(DTM), ncol = d) # This should be the reduced matrix
where d is the number of columns of DTM whose sum exceeds the threshold.
> c = 1 # new counter: next free column of DTMr
> for (col in 1:ncol(DTM))
+ {
+   if (sum(DTM[,col]) > 2)
+   {
+     DTMr[,c] = DTM[,col] # copies the values only, not the column name
+     c = c + 1
+   }
+ }
Unfortunately, although the values in DTMr come out right this way, it loses all the column labels (word1, … wordn).
Any ideas?
Claudio
A simple solution using subsetting and colSums:
Create some sample data:
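A minimal sketch of the sample data, assuming the values and column names shown in the question (the way the matrix is built here is my own choice, not necessarily how the original DTM was created):

```r
# Sample document-term matrix with the values from the question.
# Column names are set via dimnames; rows are left unnamed.
DTM <- matrix(
  c(1, 1, 0, 0,
    2, 0, 1, 1,
    0, 1, 0, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("word1", "word2", "word3", "word4"))
)
DTM
```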
Subset:
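A sketch of the subsetting step, assuming the same DTM as in the question and a threshold of 2 (the names threshold and keep are only illustrative). Indexing with a logical vector keeps the column names, which the element-by-element copy in the loop does not:

```r
# Same sample matrix as in the question, repeated so this runs on its own.
DTM <- matrix(c(1, 1, 0, 0,  2, 0, 1, 1,  0, 1, 0, 2),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, paste0("word", 1:4)))

threshold <- 2                      # illustrative name; value taken from the question
keep <- colSums(DTM) > threshold    # logical vector, one entry per column

# Logical column indexing preserves the names of the selected columns.
# drop = FALSE keeps the result a matrix even if only one column survives.
DTMr <- DTM[, keep, drop = FALSE]
DTMr
```

With the sample data this keeps word1 and word4. No preallocation and no counter are needed, so d never has to be computed by hand.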