I’ve been searching for a command in R that will let me group by just a portion of what is in a field, rather than the entire field. I came up with a workaround that works, but it is cumbersome and clumsy. Here is a test data frame:
name.list = data.frame(Name=c("jeff banks", "phil lender", "jeff brooks",
"barbara holcomb", "danny jefferson"),Age=c(27,34,25,45,32))
name.list
This is the output:
             Name Age
1      jeff banks  27
2     phil lender  34
3     jeff brooks  25
4 barbara holcomb  45
5 danny jefferson  32
I would like to identify all the Name entries that contain “jeff”
so I can use that as a group or assign a dummy variable. In other words, I want to append
something like this to my data frame:
             Name Age Jeff.field
1      jeff banks  27          1
2     phil lender  34          0
3     jeff brooks  25          1
4 barbara holcomb  45          0
5 danny jefferson  32          1
I came up with this solution, but it is not very elegant:
name.list2=name.list[grep("jeff",name.list$Name),]
name.list2$jeff.field=rep(1,dim(name.list2)[1])
name.list3=name.list[-grep("jeff",name.list$Name),]
name.list3$jeff.field=rep(0,dim(name.list3)[1])
name.list4=rbind(name.list2,name.list3)
name.list4
This gives me the following data frame (note that the rows are no longer in their original order):
             Name Age jeff.field
1      jeff banks  27          1
3     jeff brooks  25          1
5 danny jefferson  32          1
2     phil lender  34          0
4 barbara holcomb  45          0
Does anyone know of a more basic approach?
Here you go:
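One way to do this, offered as a minimal sketch and not necessarily what the original answer contained, is to use base R's grepl(), which returns TRUE or FALSE for each element instead of the matching positions that grep() returns. Converting that logical vector to integer gives the 1/0 dummy variable in a single step and leaves the rows in their original order. The column name jeff.field is taken from the question; the aggregate() call at the end is only an illustrative assumption about how the flag might then be used for grouping.

```r
# Rebuild the example data frame from the question.
name.list <- data.frame(
  Name = c("jeff banks", "phil lender", "jeff brooks",
           "barbara holcomb", "danny jefferson"),
  Age = c(27, 34, 25, 45, 32)
)

# grepl() tests every Name for the pattern and returns TRUE/FALSE;
# as.integer() turns that into the 1/0 dummy variable.
name.list$jeff.field <- as.integer(grepl("jeff", name.list$Name))
name.list

# Illustrative use of the flag as a grouping variable: mean Age per group.
aggregate(Age ~ jeff.field, data = name.list, FUN = mean)
```

The pattern matches anywhere in the string, so "danny jefferson" is flagged as well, which agrees with the desired output in the question. To flag only first names, an anchored pattern such as "^jeff " could be used instead.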