I’m trying to calculate the number of pairwise differences between every pair of sequences in a long list, and put the results back into matrix form. This is a toy example of what I want to do.
library(MiscPsycho)
b <- c("-BC", "ACB", "---") # Toy example of sequences
workb <- expand.grid(b,b)
new <- c(1:9)
# Need to get rid of this for loop somehow
for (i in 1:9) {
  new[i] <- stringMatch(workb[i,1], workb[i,2], normalize="NO")
}
workb <- cbind(workb, new)
newmat <- reShape(workb$new, id=workb$Var1, colvar=workb$Var2)
a <- c("Subject1", "Subject2", "Subject3") #Relating it back to the subject ID
colnames(newmat) <- a
rownames(newmat) <- a
newmat
I’m not very familiar with the apply functions, but I’d like to use them to replace the for loop, which will probably get slow given the large number of sequences I have. (The stringMatch function is from MiscPsycho.) Please let me know how to make this more efficient!
Thank you very much!
To get those “pairwise distances” I would have done something like:
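One possible sketch of such an approach, using only base R and not necessarily what was originally intended here: `outer()` builds the full matrix directly, so neither `expand.grid()`, the loop, nor the reshape step is needed. The helper `count_diffs` is hypothetical (it is not from MiscPsycho) and assumes that "differences" means position-by-position mismatches between equal-length sequences. If edit distance is what is wanted, base R's `adist()` returns a Levenshtein distance matrix in a single call.

```r
# Toy sequences from the question, named by subject ID
b <- c(Subject1 = "-BC", Subject2 = "ACB", Subject3 = "---")

# Hypothetical helper (an assumption, not part of MiscPsycho): counts the
# positions at which two equal-length strings differ.
count_diffs <- function(x, y) {
  sum(strsplit(x, "")[[1]] != strsplit(y, "")[[1]])
}

# outer() hands whole vectors to its function, so the scalar helper is
# wrapped with Vectorize() to be applied pair by pair.
newmat <- outer(b, b, Vectorize(count_diffs))
dimnames(newmat) <- list(names(b), names(b))  # label rows/columns by subject
newmat

# Alternative, assuming Levenshtein (edit) distance is the desired measure:
lev <- adist(b)
dimnames(lev) <- list(names(b), names(b))
lev
```

`Vectorize()` still calls the helper once per pair, so this mainly tidies the code; for a large number of sequences, a function that is vectorized internally, such as `adist()`, is likely to be the faster option.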