I am writing a loop that takes two files per run, e.g. a0.txt and b0.txt. I am running this over 100 files that run from a0.txt and b0.txt to a999.txt and b999.txt. The pattern I use works perfectly if I do the run for files a0 and b0 to a9 and b9 with only file pairs 0-9 in the directory. But when I put more files in the directory and do the run from 0:10, the loop fails and confuses vectors in files. I think this is because of the pattern I use, i.e.
list.files(pattern=paste('.', x, '\\.txt', sep=''))
This only looks for files whose names match '.', x, '\\.txt'.
So if '.'=a and x=1 it finds file a1. But I think it gets confused between a0 and a10 when I do the run over more files. I cannot seem to find an appropriate loop that will also search for files up to a999 and b999.
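The suspected collision can be checked without any files on disk. Below is a minimal sketch, assuming a made-up vector of file names (fnames) standing in for the directory listing. grepl applies the same kind of regular expression that list.files(pattern=...) uses, and the anchored pattern is only one possible way to tighten the match.

```r
# Hypothetical directory listing, used only for illustration
fnames <- c("a0.txt", "a1.txt", "a10.txt", "b0.txt", "b1.txt", "b10.txt")

x <- 0

# Original pattern: '.' is a regex wildcard (any single character), so
# ".0\\.txt" also matches the tail "10.txt" of a10.txt and b10.txt
loose <- paste('.', x, '\\.txt', sep='')
loose_hits <- fnames[grepl(loose, fnames)]

# Anchored pattern: the whole name must be exactly a<x>.txt or b<x>.txt
strict <- paste('^[ab]', x, '\\.txt$', sep='')
strict_hits <- fnames[grepl(strict, fnames)]

print(loose_hits)
print(strict_hits)
```

With the loose pattern the second match for x=0 is a10.txt rather than b0.txt, which would be consistent with vectors from different files getting mixed up once more files are in the directory.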
Can anyone help with a better way to do this? Code below.
dostuff <- function(x)
{
  files <- list.files(pattern=paste('.', x, '\\.txt', sep=''))
  a <- read.table(files[1],header=FALSE) #file a0.txt
  G <- a$V1-a$V2
  b <- read.table(files[2],header=FALSE) #file b0.txt
  as.factor(b$V2)
  q <- tapply(b$V3,b$V2,Fun=length)
  H <- b$V1-b$V2
  model <- lm(G~H)
  return(model$coefficients[2],q)
}
results <- sapply(0:10,dostuff)
Error in tapply(b$V3, b$V2, FUN = length) : arguments must have same length
How about getting the files directly, without searching? i.e.
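A minimal sketch of that idea, assuming the files are named exactly a<x>.txt and b<x>.txt and sit in the working directory. The helper name pair_files is hypothetical and not part of the original code.

```r
# Build the two file names for index x directly, with no pattern matching
pair_files <- function(x) {
  paste(c("a", "b"), x, ".txt", sep = "")
}

files <- pair_files(10)
print(files)

# Optional guard before reading (assumption: missing pairs should be skipped):
# if (!all(file.exists(files))) return(NULL)
```

Because the names are constructed rather than searched for, files[1] is always the a file and files[2] the b file, whatever else is in the directory.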
But the error message says the problem is caused by the call to tapply rather than anything about incorrect file names, and I have literally no idea how that could happen, since I thought a data frame (which read.table creates) always has the same number of rows for each column. Did you copy-paste that error message out of R? (I have a feeling there might be a typo, and so it was, for example, q <- tapply(a$V3,b$V2,Fun=length). But I could easily be wrong.)
Also, as.factor(b$V2) doesn’t modify b$V2, it just returns a factor representing b$V2: after you call as.factor, b$V2 is still a vector. You need to assign it to something, e.g.:
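A small illustration of the difference, using a made-up data frame b in place of the one read from file (the values are invented for the example). The factor is stored in a separate variable, grp (a name chosen here), so that b$V2 stays numeric for the later b$V1-b$V2.

```r
# Made-up stand-in for the data frame returned by read.table
b <- data.frame(V1 = c(5, 7, 9, 4), V2 = c(1, 2, 1, 2), V3 = c(10, 20, 30, 40))

as.factor(b$V2)                   # result is discarded; b is unchanged
still_plain <- !is.factor(b$V2)   # b$V2 is still a plain numeric vector

grp <- as.factor(b$V2)            # keep the factor by assigning it

# Count the rows in each level of grp (note the upper-case FUN)
q <- tapply(b$V3, grp, FUN = length)
print(q)
```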