I need some suggestions on how to better design the solution to my problem.
I am starting from many CSV files containing the results of a parametric study (time series data). I want to analyze the influence of some parameters on a variable. The idea is to extract some variables from the table of results for each id of the parametric study and create a data.frame for each variable, so that I can easily make plots and do some analysis.
The problem is that some parameters change the time step of the parametric study, so some CSV files are much longer. One variable, for example, is temperature. Is it possible to keep the different time steps and still evaluate Delta T when varying one parameter? Can plyr do that? Or do I have to resample part of my results to make this evaluation, losing part of the information?
This is where I have got to so far:
> head(data, 5)
names Date.Time Tout.dry.bulb RHout TsupIn TsupOut QconvIn[Wm2]
1 G_0-T_0-W_0-P1_0-P2_0 2005-01-01 00:03:00 0 50 23 15.84257 -1.090683e-14
2 G_0-T_0-W_0-P1_0-P2_0 2005-01-01 00:06:00 0 50 23 16.66988 0.000000e+00
3 G_0-T_0-W_0-P1_0-P2_0 2005-01-01 00:09:00 0 50 23 13.83446 1.090683e-14
4 G_0-T_0-W_0-P1_0-P2_0 2005-01-01 00:12:00 0 50 23 14.34774 2.181366e-14
5 G_0-T_0-W_0-P1_0-P2_0 2005-01-01 00:15:00 0 50 23 12.59164 2.181366e-14
QconvOut[Wm2] Hvout[Wm2K] Qradout[Wm2] MeanRadTin MeanAirTin MeanOperTin
1 0.0000 17.76 -5.428583e-08 23 23 23
2 -281.3640 17.76 -1.151613e-07 23 23 23
3 -296.0570 17.76 -1.018871e-07 23 23 23
4 -245.7001 17.76 -1.027338e-07 23 23 23
5 -254.8158 17.76 -9.458750e-08 23 23 23
> str(data)
'data.frame': 1858080 obs. of 13 variables:
$ names : Factor w/ 35 levels "G_0-T_0-W_0-P1_0-P2_0",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Date.Time : POSIXct, format: "2005-01-01 00:03:00" "2005-01-01 00:06:00" "2005-01-01 00:09:00" ...
$ Tout.dry.bulb: num 0 0 0 0 0 0 0 0 0 0 ...
$ RHout : num 50 50 50 50 50 50 50 50 50 50 ...
$ TsupIn : num 23 23 23 23 23 23 23 23 23 23 ...
$ TsupOut : num 15.8 16.7 13.8 14.3 12.6 ...
$ QconvIn[Wm2] : num -1.09e-14 0.00 1.09e-14 2.18e-14 2.18e-14 ...
$ QconvOut[Wm2]: num 0 -281 -296 -246 -255 ...
$ Hvout[Wm2K] : num 17.8 17.8 17.8 17.8 17.8 ...
$ Qradout[Wm2] : num -5.43e-08 -1.15e-07 -1.02e-07 -1.03e-07 -9.46e-08 ...
$ MeanRadTin : num 23 23 23 23 23 23 23 23 23 23 ...
$ MeanAirTin : num 23 23 23 23 23 23 23 23 23 23 ...
$ MeanOperTin : num 23 23 23 23 23 23 23 23 23 23 ...
> names(DF)
[1] "G_0-T_0-W_0-P1_0-P2_0" "G_0-T_0-W_0-P1_0-P2_1" "G_0-T_0-W_0-P1_0-P2_2"
[4] "G_0-T_0-W_0-P1_0-P2_3" "G_0-T_0-W_0-P1_0-P2_4" "G_0-T_0-W_0-P1_0-P2_5"
[7] "G_0-T_0-W_0-P1_0-P2_6" "G_0-T_0-W_0-P1_1-P2_0" "G_0-T_0-W_0-P1_1-P2_1"
[10] "G_0-T_0-W_0-P1_1-P2_2" "G_0-T_0-W_0-P1_1-P2_3" "G_0-T_0-W_0-P1_1-P2_4"
[13] "G_0-T_0-W_0-P1_1-P2_5" "G_0-T_0-W_0-P1_1-P2_6" "G_0-T_0-W_0-P1_2-P2_0"
[16] "G_0-T_0-W_0-P1_2-P2_1" "G_0-T_0-W_0-P1_2-P2_2" "G_0-T_0-W_0-P1_2-P2_3"
[19] "G_0-T_0-W_0-P1_2-P2_4" "G_0-T_0-W_0-P1_2-P2_5" "G_0-T_0-W_0-P1_2-P2_6"
[22] "G_0-T_0-W_0-P1_3-P2_0" "G_0-T_0-W_0-P1_3-P2_1" "G_0-T_0-W_0-P1_3-P2_2"
[25] "G_0-T_0-W_0-P1_3-P2_3" "G_0-T_0-W_0-P1_3-P2_4" "G_0-T_0-W_0-P1_3-P2_5"
[28] "G_0-T_0-W_0-P1_3-P2_6" "G_0-T_0-W_0-P1_4-P2_0" "G_0-T_0-W_0-P1_4-P2_1"
[31] "G_0-T_0-W_0-P1_4-P2_2" "G_0-T_0-W_0-P1_4-P2_3" "G_0-T_0-W_0-P1_4-P2_4"
[34] "G_0-T_0-W_0-P1_4-P2_5" "G_0-T_0-W_0-P1_4-P2_6"
From P1_4-P2_0 to P1_4-P2_6 the length is 113760 obs instead of 37920, because the time step changes from 3 min to 1 min.
I’d like to have a separate data set for each variable, containing Date.Time and the value of that variable for each of the names, one per column.
How can I do it?
Thanks for any suggestions.
I strongly suggest using a data structure that is appropriate for working with time series. In this case, the zoo package would work well. Load each CSV file into a zoo object, using your Date.Time column to define the index (timestamps) of the data. You can use the zoo() function to create those objects, for example.
Then use the merge function of zoo to combine the objects. It will find observations with the same timestamp and put them into one row. With merge, you can specify all=TRUE to get the union of all timestamps; or you can specify all=FALSE to get the intersection of the timestamps. For the union (all=TRUE), missing observations will be NA.
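To make the difference between the union and the intersection concrete, here is a minimal sketch with two made-up series, one on a 3-minute step and one on a 1-minute step. The values, the names run_a and run_b, and the tz = "UTC" choice are assumptions for illustration only; they are not taken from your files.

```r
library(zoo)

# Hypothetical stand-ins for two runs of the study (values are invented)
t3 <- seq(as.POSIXct("2005-01-01 00:03:00", tz = "UTC"), by = "3 min", length.out = 3)
t1 <- seq(as.POSIXct("2005-01-01 00:01:00", tz = "UTC"), by = "1 min", length.out = 9)

run_a <- zoo(c(15.8, 16.7, 13.8), order.by = t3)          # 3-minute time step
run_b <- zoo(seq(15, 17, length.out = 9), order.by = t1)  # 1-minute time step

# Union of timestamps: run_a is NA wherever only run_b has an observation
both_union <- merge(run_a, run_b, all = TRUE)

# Intersection of timestamps: only the rows where both runs have a value
both_common <- merge(run_a, run_b, all = FALSE)

# Delta T between the two runs at the shared timestamps
delta_t <- both_common[, "run_a"] - both_common[, "run_b"]
```

With all=TRUE nothing is thrown away, so the 1-minute detail is kept and the NA rows show where the coarser run has no observation. With all=FALSE you get only the timestamps the two runs share, which is enough to compute Delta T without resampling.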
The read.zoo function could be difficult to use for reading your data. I suggest replacing your call to read.zoo with something like this:
(I assume that Date.Time is the first column in your file. That’s why I wrote table[,-1].)
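A minimal sketch of that idea, reading the file with read.csv and then building the zoo object by hand. The temporary file, its contents, the name csv_path, and the tz = "UTC" choice are all assumptions made so the example runs on its own; substitute your own file and time zone.

```r
library(zoo)

# Hypothetical file standing in for one CSV of the parametric study
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date.Time,TsupOut,MeanAirTin",
  "2005-01-01 00:03:00,15.84257,23",
  "2005-01-01 00:06:00,16.66988,23",
  "2005-01-01 00:09:00,13.83446,23"
), csv_path)

table <- read.csv(csv_path, stringsAsFactors = FALSE)

# The first column holds the timestamps and becomes the index
stamps <- as.POSIXct(table[, 1], format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# The remaining columns (table[, -1]) are the data
z <- zoo(as.matrix(table[, -1]), order.by = stamps)
```

Each file read this way gives one zoo object, and those objects can then be passed to merge.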