I’ve written a script based on a for-loop to read in columns of multiple

Asked: June 7, 2026

I’ve written a script based on a for-loop that reads columns from multiple .xls files, combines them into a single data frame, searches for negative values, and writes these values together with the file name to a .txt file.
The script basically works, but I have several hundred files to process, and it’s a bit slow. This version of the script is only a basic framework for later statistical analysis, and I want to parallelize the execution to speed it up.
I’ve tried to avoid the for-loop by applying the function via lapply and the plyr package, but had problems passing the file list to “readWorksheetFromFile” (Error in path.expand(filename) : invalid ‘path’ argument).
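
The failing lapply call is not shown, so the cause of the error can only be guessed. One plausible cause, stated here as an assumption, is that something other than a single character string (for example a list) reached the file argument. A minimal sketch in base R, with hypothetical file names, shows what lapply-style functions hand to the callback and how a non-character argument triggers the same message from path.expand:

```r
# Hypothetical file names, standing in for the result of list.files()
files <- c("a.xls", "b.xls")

# Each element reaches the function as a length-one character vector,
# which is what a file argument expects
classes <- vapply(files, function(fl) class(fl), character(1),
                  USE.NAMES = FALSE)

# Passing a list instead of a character vector makes path.expand fail;
# the error message is captured rather than stopping the script
msg <- tryCatch(path.expand(list("a.xls")),
                error = function(e) conditionMessage(e))

print(classes)
print(msg)
```

If the file list was wrapped in a list or converted to a factor somewhere along the way, passing the plain character vector returned by list.files() may already be enough.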

Here is the working script:

require(XLConnect)
setwd(choose.dir())

# escape the dot so the pattern only matches names ending in ".xls"
input = list.files(pattern = "\\.xls$")

# creates empty data frame 
df = data.frame(Name=NULL, PCr=NULL, bATP=NULL, Pi=NULL)

for(i in seq(along=input)){
    data = data.frame(readWorksheetFromFile(input[i], sheet="Output Data", 
    startRow=2, startCol=c(10, 13, 16), endCol=c(10, 13, 16), header=TRUE))

    # drop the last row; the result must be assigned to take effect
    data = head(data, n = -1L)

    colnames(data) = c("PCr", "bATP", "Pi")
    data$Name = file.path(input[i])

    # attach() is not needed here; rbind works on the data frames directly
    df = rbind(data, df)
    rm(data)
}

# searches for negative values in df and writes to txt file 
neg_val = subset(df, bATP<0 | Pi<0 | PCr<0)
write.table(neg_val, file = "neg_val.txt", sep = "\t", quote=FALSE)

Any solutions to this problem, or other suggestions to speed up execution?

Thanks,
Markus

1 Answer

Editorial Team · Answer 1 · 2026-06-07T02:06:06+00:00

I still don’t know why Martin’s code is not working on my data, but I’ve found another solution. In a first test it was about 4x faster than my original approach.

# load required packages
require(XLConnect)
# set working dir
setwd(choose.dir())

# creates list of files of chosen dir and all subdirectories
files = list.files(pattern = "\\.xls$", recursive=TRUE, full.names=TRUE)

data = do.call("rbind", lapply(files, function(fl) {
   # Read data from file
   data.tmp = data.frame(readWorksheetFromFile(file = fl, sheet="Output Data", 
                         startRow=2, startCol=c(10, 13, 16), 
                         endCol=c(10, 13, 16), header=TRUE))

  # deletes last row of data frame (head() returns a copy, so assign the result)
  data.tmp = head(data.tmp, n = -1L)

  # add file names as column 
  data.tmp$File = file.path(fl)
  data.tmp
}))

# rename columns
colnames(data) = c("PCr", "bATP", "Pi", "File")
# list negative values 
neg.val = subset(data, bATP<0 | Pi<0 | PCr<0)
# write output file
write.table(neg.val, file = "neg_val.txt", sep = "\t", quote=FALSE)

Thanks to all and best regards,
Markus
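
The original question also asked about parallel execution. The lapply version above maps fairly directly onto parLapply from the base parallel package. The following is only a sketch under stated assumptions, not a tested replacement: the helper names read_one, read_all_parallel and xls_reader are hypothetical, the number of workers is arbitrary, and whether it is actually faster depends on how long each worker takes to start XLConnect's Java runtime.

```r
library(parallel)

# Hypothetical helper: read one file with `reader`, drop the last row,
# name the columns and tag every row with the file it came from
read_one <- function(fl, reader) {
  d <- reader(fl)
  d <- utils::head(d, n = -1L)
  colnames(d) <- c("PCr", "bATP", "Pi")
  d$File <- fl
  d
}

# Hypothetical helper: spread the files over worker processes and
# row-bind the per-file results into one data frame
read_all_parallel <- function(files, reader, n_workers = 2L) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
  parts <- parLapply(cl, files, read_one, reader = reader)
  do.call(rbind, parts)
}

# Reader matching the call used in the answer; XLConnect:: makes each
# worker load the package itself (assumes it is installed on all workers)
xls_reader <- function(fl) {
  data.frame(XLConnect::readWorksheetFromFile(fl, sheet = "Output Data",
             startRow = 2, startCol = c(10, 13, 16),
             endCol = c(10, 13, 16), header = TRUE))
}

# Possible usage with the `files` vector from the answer:
# data <- read_all_parallel(files, xls_reader, n_workers = 4L)
# neg.val <- subset(data, bATP < 0 | Pi < 0 | PCr < 0)
```

Passing the reader as an argument keeps the parallel part independent of XLConnect, so it can be tried out with any function that returns a three-column data frame per file.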
