I have two large vectors:
A: https://dl.dropbox.com/u/22681355/A.csv
B: https://dl.dropbox.com/u/22681355/B.csv
A has over 20000 entries but only 1350 unique values.
B holds random numbers from 1 to 9, generated exactly 1350 times.
I would like to assign values from B to A such that the same values in A get the same values in B. e.g. if there are multiple 1’s each 1 should get the same number from B.
I have been using the A[B] command, but after the 18000th entry I get NAs.
What is the proper way of doing this?
code:
A<-read.csv("A.csv")
B<-read.csv("B.csv")
A[B]
read.csv() creates a data frame, not a vector. B[A], rather than A[B], is the form that for each element in A gets the value of B at the index of that element’s value. Since A’s values range from 1 to 1899, they exceed B’s size of 1349. For those elements outside the bounds of B, NAs get introduced. The correct way to achieve what you want is to index B by the position of each element of A among A’s distinct values.
match(A, levels(A)) will return a vector of the same length as A that for each element contains the position of that element in the factor’s levels, i.e. a number between 1 and 1350 (1350 distinct values). If A was as.factor(c(1,1,3,5,5,7)), levels(A) would be c("1","3","5","7") (levels are stored as character strings) and match(A, levels(A)) would be c(1,1,2,3,3,4), i.e. the position of each element in its levels.
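As a minimal sketch of the idea above, here is the small example worked through end to end. The values in B and the names vals, naive, idx and result are made up for illustration. For the real files you would presumably pull the first column out of each data frame first, e.g. as.factor(read.csv("A.csv")[[1]]), which assumes each file has a single column.

```r
# Stand-in data; B here is hypothetical, one value per distinct value of A.
vals <- c(1, 1, 3, 5, 5, 7)     # raw values of A
A <- as.factor(vals)            # 6 entries, 4 distinct values
B <- c(9, 2, 4, 6)              # 4 entries, one per level of A

# Indexing B by the raw values runs past the end of B and gives NAs.
naive <- B[vals]
print(naive)                    # 9 9 4 NA NA NA

# Indexing B by the position of each element among A's levels stays in bounds.
idx <- match(A, levels(A))
result <- B[idx]
print(idx)                      # 1 1 2 3 3 4
print(result)                   # 9 9 2 4 4 6
```

Equal values in A always map to the same position in levels(A), so they always pick up the same value from B.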