I have a list MF which contains 105 lists. Each list, MF[[1]], MF[[2]], …, MF[[105]], contains a different number of data frames. For example, MF[[1]][[1]] exists but MF[[1]][[2]] does not, because MF[[1]] holds just one data frame. On the other hand, MF[[2]] contains 15 different data frames, so MF[[2]][[1]] to MF[[2]][[15]] exist.
The column names of all the data frames in each of the 105 lists are:
[1] "Run" "Fecha" "Serie" "Patrimonio" "Ret Log Pat" "Percentil 5%" "Percentil Monto"
I'll ask my question with a concrete example. Let's use MF[[2]], which contains 15 different data frames. Here are the first rows of some of those data frames:
head(MF[[2]][[1]]):
Run Fecha Serie Patrimonio Ret Log Pat Percentil 5% Percentil Monto
31 8011 2002-08-18 1 4191689227 -0.456258862 -0.1973659 1305605031
32 8011 2002-08-19 1 4749171865 0.124866449 -0.2179453 913558775
33 8011 2002-08-20 1 5132656241 0.077653052 -0.2179453 1035059470
34 8011 2002-08-21 1 5088469783 -0.008646158 -0.2179453 1118638070
35 8011 2002-08-22 1 4998945148 -0.017750234 -0.2179453 1109007841
36 8011 2002-08-23 1 5449454077 0.086288515 -0.2179453 1089496372
head(MF[[2]][[2]])
Run Fecha Serie Patrimonio Ret Log Pat Percentil 5% Percentil Monto
31 8011 2006-05-09 100 6413583009 -0.0076314490 -0.07046562 455399234
32 8011 2006-05-10 100 6412446421 -0.0001772315 -0.07046562 451937105
33 8011 2006-05-11 100 6380254435 -0.0050328784 -0.07046562 451857014
34 8011 2006-05-12 100 6381112038 0.0001344061 -0.07046562 449588586
35 8011 2006-05-13 100 6381970402 0.0001345073 -0.07046562 449649018
36 8011 2006-05-14 100 6315827940 -0.0104180360 -0.07046562 449709503
head(MF[[2]][[3]])
Run Fecha Serie Patrimonio Ret Log Pat Percentil 5% Percentil Monto
31 8011 2002-08-18 2 3147993667 -0.0395416467 -0.03216529 105340167
32 8011 2002-08-19 2 3065335420 -0.0266083198 -0.03778848 118957901
33 8011 2002-08-20 2 3044946268 -0.0066737439 -0.03778848 115834372
34 8011 2002-08-21 2 3089802537 0.0146239300 -0.03778848 115063897
35 8011 2002-08-22 2 3090714960 0.0002952578 -0.03778848 116758947
36 8011 2002-08-23 2 3230667973 0.0442864759 -0.03778848 116793426
What I want is a loop (or any other approach) that matches rows on the column "Fecha" (which means "Date", by the way) and, where the dates match, calculates each row's "Patrimonio" as a percentage of the total "Patrimonio" summed over all rows with that date.
For example, in this case we have:
head(MF[[2]][[1]]):
Run Fecha Serie Patrimonio Ret Log Pat Percentil 5% Percentil Monto
31 8011 2002-08-18 1 4191689227 -0.456258862 -0.1973659 1305605031
head(MF[[2]][[3]])
Run Fecha Serie Patrimonio Ret Log Pat Percentil 5% Percentil Monto
31 8011 2002-08-18 2 3147993667 -0.0395416467 -0.03216529 105340167
So MF[[2]][[1]][1,2] == MF[[2]][[3]][1,2] (the dates match), and I want a new column in each data frame like this:
New column for MF[[2]][[1]] = MF[[2]][[1]][1,4] / (MF[[2]][[1]][1,4] + MF[[2]][[3]][1,4]) = 4191689227 / (4191689227 + 3147993667) (percentage calculation over the "Patrimonio" column)
New column for MF[[2]][[3]] = MF[[2]][[3]][1,4] / (MF[[2]][[1]][1,4] + MF[[2]][[3]][1,4]) = 3147993667 / (4191689227 + 3147993667) (percentage calculation over the "Patrimonio" column)
The thing is that I must match across all 15 data frames to calculate the "Patrimonio" percentage by the variable "Fecha", and then do the same for all 105 lists. I hope my question is clear enough.
I can't easily use your data because of the "5%" in the headers. However, you need to use the apply family for the first step. lapply will apply yourfunction to each element of MF. Since each element of MF is itself a list, you could lapply again (either inside yourfunction or as lapply(MF, lapply, yourfunction)). yourfunction will be something that does the calculation you want on a single data.frame. I find it easiest to extract one data.frame from these nested structures and write a function that works for it, then worry about applying it to all the members of the nested lists.
It sort of sounds like you want to compare the dates between data.frames. If this is the case, your best bet is to combine them into a single data.frame rather than keeping them nested in a list.
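As a rough illustration of the nested lapply pattern mentioned above, here is a minimal sketch. MF_toy and add_row_count are hypothetical names invented for this example, and the values are made up; the per-data-frame function is only a placeholder for whatever calculation you actually need.

```r
# Toy stand-in for MF: a list of lists of data frames (made-up values).
MF_toy <- list(
  list(data.frame(Fecha = "2002-08-18", Patrimonio = 100)),
  list(
    data.frame(Fecha = c("2002-08-18", "2002-08-19"), Patrimonio = c(10, 20)),
    data.frame(Fecha = "2002-08-18", Patrimonio = 30)
  )
)

# Hypothetical per-data-frame function: write and test it on one frame first.
add_row_count <- function(df) {
  df$n_rows <- nrow(df)
  df
}

# Check it on a single extracted data frame...
one <- add_row_count(MF_toy[[2]][[1]])

# ...then apply it to every data frame in every inner list.
# The outer lapply passes add_row_count through to the inner lapply.
result <- lapply(MF_toy, lapply, add_row_count)
```

This keeps the nested structure intact, so it only suits calculations that look at one data.frame at a time. Comparing dates across data.frames needs the combining step described above.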
You can do this in a few ways, but I like the plyr package. Once everything is in a single data.frame, the comparisons are much more straightforward.
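To make the combine-then-compare idea concrete, here is a minimal sketch that uses base R (do.call with rbind, plus ave) instead of plyr, so it runs without extra packages. The names fund, add_share and Pct are hypothetical, the values are small made-up numbers, and it assumes every data frame has the same columns, with "Fecha" stored in a form that compares equal for equal dates.

```r
# Toy data frames standing in for the elements of one inner list such as MF[[2]].
fund <- list(
  data.frame(Fecha = c("2002-08-18", "2002-08-19"), Serie = 1, Patrimonio = c(40, 50)),
  data.frame(Fecha = "2006-05-09", Serie = 100, Patrimonio = 70),
  data.frame(Fecha = c("2002-08-18", "2002-08-19"), Serie = 2, Patrimonio = c(60, 50))
)

# Hypothetical helper: stack one fund's data frames and add the share column.
add_share <- function(frames) {
  # Stack all data frames into one; this assumes identical column names.
  combined <- do.call(rbind, frames)
  # ave() returns, for each row, the sum of Patrimonio over rows with the same Fecha.
  total_by_date <- ave(combined$Patrimonio, combined$Fecha, FUN = sum)
  combined$Pct <- combined$Patrimonio / total_by_date
  combined
}

shares <- add_share(fund)
```

A date that appears in only one data frame gets a share of 1, since that row is the whole total for that date. Under the same assumptions, lapply(MF, add_share) would repeat the calculation for each of the 105 inner lists, and split(shares, shares$Serie) is one way to get back to one data frame per series if you need that.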