I have a data frame, call it A, that looks something like this:
GroupID Dist1 Dist2 ...
1 4 4
1 5 4
1 3 16
2 0 4
2 7 2
2 8 0
2 6 4
2 7 4
2 8 2
3 7 4
3 5 6
...
GroupID is a factor; Dist1 and Dist2 are integers.
I have a derived data frame, SummaryA:
GroupID AveD1 AveD2 ...
1 4 8
2 6 2
3 6 5
...
For each GroupID, I need to find the ROW NUMBER of the row that holds the minimum value of a column (Dist1, say), both to do further manipulation and to extract data into my summary set. For instance, I need:
GroupID MinRowD1
1 3
2 4
3 11
On ties, it doesn’t matter which row I choose, but I’m stuck as to how to get this. I can’t use which(), because it doesn’t operate over factors nicely, and I can’t use ave(FUN = min), because I need the location, not the minimum value.
If I match each group against its minimum value, I can get multiple matches, which breaks the later steps.
Any suggestions for how to do this?
Here’s a base R solution; the basic idea is to split the data by GroupID, get the row with the minimum value for each, and then put it back together. Some think the plyr functions are a more intuitive way to do this; I’m sure a solution using one of them will appear shortly.
For large data sets, split is faster when performed on a vector rather than on a data frame.
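A minimal sketch of both variants, assuming the sample rows shown in the question; it may not match the code originally posted with this answer. The helper column Row and the names pieces, minRows, idx, byFrame, and byVector are made up for illustration. which.min() returns the position of the first minimum, so ties resolve to the earliest row in each group.

```r
# Sample rows copied from the question (the "..." columns and rows are left out).
A <- data.frame(
  GroupID = factor(c(1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3)),
  Dist1   = c(4L, 5L, 3L, 0L, 7L, 8L, 6L, 7L, 8L, 7L, 5L),
  Dist2   = c(4L, 4L, 16L, 4L, 2L, 0L, 4L, 4L, 2L, 4L, 6L)
)

# Variant 1: split the whole data frame by group, keep the first
# minimum-Dist1 row of each piece, then bind the pieces back together.
A$Row   <- seq_len(nrow(A))   # hypothetical helper column: original row number
pieces  <- split(A, A$GroupID)
minRows <- do.call(rbind, lapply(pieces, function(d) d[which.min(d$Dist1), ]))
byFrame <- data.frame(GroupID = minRows$GroupID, MinRowD1 = minRows$Row)

# Variant 2: split only the vector of row indices, which avoids copying
# the data frame into one sub-frame per group.
idx      <- split(seq_len(nrow(A)), A$GroupID)
minRowD1 <- sapply(idx, function(i) i[which.min(A$Dist1[i])])
byVector <- data.frame(
  GroupID  = factor(names(minRowD1), levels = levels(A$GroupID)),
  MinRowD1 = unname(minRowD1)
)

print(byFrame)
print(byVector)
```

On the sample rows, both variants pick rows 3, 4, and 11 for groups 1, 2, and 3. The resulting MinRowD1 values can then be used to index A directly, for example A[byVector$MinRowD1, ].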