How do I select the first row of an R data frame that meets certain criteria?
Here is the context:
I have a data frame with five columns: "pixel", "year", "propvar", "component", and "cumsum".
There are 1,225 combinations of pixel and year, because the data was computed from the annual time series of 49 geographic pixels for each of 25 study years. Within each pixel-year, I have computed propvar, the proportion of total variance explained by a given component of the fast Fourier transform for the time series of a given pixel-year. I then computed cumsum, which is the cumulative sum of propvar for each frequency component within a pixel-year. The component column just gives you an index for the Fourier series component (plus 1) from which propvar was calculated.
I want to determine the number of components required to explain greater than 99% of the variance. I figure one way to do this is to find the first row within each pixel-year where cumsum > 0.99, and create a data frame from it with three columns, pixel, year, and numbercomps, where numbercomps is the number of components required within a given pixel-year to explain greater than 99% of the variance. I do not know how to do this in R. Does anyone have a solution?
Sure. Something like this should do the trick:
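As a minimal base R sketch: the data frame name `df` and the toy values below are made up for illustration, and I am assuming that the `component` value of the first qualifying row is the number of components you want.

```r
# Hypothetical toy data: 2 pixels x 2 years, 4 components per pixel-year
df <- data.frame(
  pixel     = rep(c(1, 1, 2, 2), each = 4),
  year      = rep(c(2001, 2002, 2001, 2002), each = 4),
  component = rep(1:4, times = 4),
  propvar   = c(0.900, 0.095, 0.004, 0.001,
                0.995, 0.003, 0.001, 0.001,
                0.500, 0.300, 0.195, 0.005,
                0.700, 0.295, 0.004, 0.001)
)

# Cumulative proportion of variance within each pixel-year
df$cumsum <- ave(df$propvar, df$pixel, df$year, FUN = cumsum)

# Keep only rows past the threshold, sorted so the lowest component comes first
above <- df[df$cumsum > 0.99, ]
above <- above[order(above$pixel, above$year, above$component), ]

# duplicated() flags every row after the first within each pixel-year,
# so negating it keeps exactly the first qualifying row per group
first <- above[!duplicated(above[c("pixel", "year")]), ]

result <- data.frame(pixel       = first$pixel,
                     year        = first$year,
                     numbercomps = first$component)
```

On the toy data this gives one row per pixel-year, with numbercomps equal to 2, 1, 3, and 2. If a pixel-year never exceeds 0.99 it simply drops out of `result`.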
EDIT: Also, for those interested in data.table, there is this:
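A rough sketch of the data.table approach, again with made-up names (`dt`, the toy values) and the same assumption that `component` of the first qualifying row is the count you are after:

```r
library(data.table)

# Hypothetical toy data: 2 pixels in one year, 4 components each
dt <- data.table(
  pixel     = rep(c(1, 2), each = 4),
  year      = 2001,
  component = rep(1:4, times = 2),
  propvar   = c(0.900, 0.095, 0.004, 0.001,
                0.500, 0.300, 0.195, 0.005)
)

# Cumulative proportion of variance within each pixel-year
dt[, cumsum := cumsum(propvar), by = .(pixel, year)]

# Sort so that the first row in each group has the lowest component
setorder(dt, pixel, year, component)

# Filter in i, then take the first remaining component per pixel-year
result_dt <- dt[cumsum > 0.99,
                .(numbercomps = component[1]),
                by = .(pixel, year)]
```

The filter in `i` runs before grouping, so `component[1]` is the first row past the threshold in each pixel-year. Here that is 2 for pixel 1 and 3 for pixel 2.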