This question is almost the same as a previous question, but differs enough that the answers to that question don’t work here. Like @chase in the last question, I want to write out multiple files, one for each split of a dataframe, in the following format (custom FASTA).
#same df as last question
df <- data.frame(
var1 = sample(1:10, 6, replace = TRUE)
, var2 = sample(LETTERS[1:2], 6, replace = TRUE)
, theday = c(1,1,2,2,3,3)
)
#how I want the data to look
write(paste(">", df$var1,"_", df$var2, "\n", df$theday, sep=""), file="test.txt")
#whole df output looks like this:
#test.txt
>1_A
1
>8_A
1
>4_A
2
>9_A
2
>2_A
3
>1_A
3
However, instead of writing the entire dataframe to a single file, I want to generate an individual file for each subset of the data. Using d_ply as follows:
d_ply(df, .(theday), function(x) write(paste(">", df$var1,"_", df$var2, "\n", df$theday, sep=""), file=paste(x$theday,".fasta",sep="")))
I get the following output error:
Error in file(file, ifelse(append, "a", "w")) :
invalid 'description' argument
In addition: Warning messages:
1: In if (file == "") file <- stdout() else if (substring(file, 1L, :
the condition has length > 1 and only the first element will be used
2: In if (substring(file, 1L, 1L) == "|") { :
the condition has length > 1 and only the first element will be used
Any suggestions on how to get around this?
Thanks,
zachcp
There were two problems with your code.
First, in constructing the file name, you passed the vector x$theday to paste(). Since x$theday is taken from a column of a data.frame, it often has more than one element. The error you saw was write() complaining when you passed several file names to its file= argument. Using instead unique(x$theday) ensures that you will only ever paste together a single file name rather than possibly more than one.
Second, you didn’t get far enough to see it, but you probably want to write the contents of x (the current subset of the data.frame), rather than the entire contents of df to each file.
Here is the corrected code, which appears to work just fine.
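A minimal sketch of the two fixes described above, assuming the plyr package is installed and using the df from the question. The set.seed() call is my own addition, only there to make the sample data reproducible, and the details may differ from the code originally posted:

```r
library(plyr)

set.seed(1)  # assumption: not in the question, added for reproducibility
df <- data.frame(
  var1 = sample(1:10, 6, replace = TRUE),
  var2 = sample(LETTERS[1:2], 6, replace = TRUE),
  theday = c(1, 1, 2, 2, 3, 3)
)

# Why the original call failed: a two-row subset yields two file names,
# while unique() collapses them to one.
length(paste(c(1, 1), ".fasta", sep = ""))          # 2
length(paste(unique(c(1, 1)), ".fasta", sep = ""))  # 1

# Fix 1: build the file name from unique(x$theday), a single value per split.
# Fix 2: write the columns of x (the current subset), not of the whole df.
d_ply(df, .(theday), function(x) {
  write(
    paste(">", x$var1, "_", x$var2, "\n", x$theday, sep = ""),
    file = paste(unique(x$theday), ".fasta", sep = "")
  )
})
```

This should leave 1.fasta, 2.fasta and 3.fasta in the working directory, each holding only the records for its own value of theday.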