Using str(survey_OM) on my data frame indicates that all of my numerical data is atomic. If I use class(survey_OM$perc.OM) it returns numeric.
I have always thought that the second column of str showed the class of the data but it does not appear that simple… so my questions are:
- What is the second column of str reporting?
- What is atomic and how does it differ from numeric?
- Why in this case would the data appear as atomic and not num or int?

Thank you.
dput(head(survey_OM, 20)) provides:
> dput(head(survey_OM, 20))
structure(list(lake = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("E-2",
"E-4", "E pond", "EX 1", "GTH 110", "GTH 112", "GTH 114", "GTH 156",
"GTH 91", "GTH 98", "N-1", "NE-10", "NE-11", "NE-3", "NE-8",
"NE-9", "NE-9b", "S-10", "S-11", "S-3", "S-6", "S-7"), class = "factor"),
date = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2007/06/15",
"2007/06/18", "2007/06/19", "2007/06/20", "2007/06/21", "2007/06/27",
"2007/06/29", "2007/07/07", "2007/07/19", "2007/07/20", "2008/07/26",
"2008/07/30", "2008/08/04", "2008/08/06"), class = "factor"),
depth = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("E",
"epi", "H", "hypo"), class = "factor"),
depth.m = structure(c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), .Label = c("", "10.9", "12.9", "1.5", "2",
"2.1", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "3", "3.1",
"3.5", "4", "4.2", "4.8", "4.9", "5", "5.1", "5.5", "6",
"6.5", "7", "7.2", "9.9", "not recorded"), class = "factor"),
rep = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A",
"B", "C"), class = "factor"),
sed = c(0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L),
notes = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("",
"col on SE side", "lg snail shell", "not collected", "very hard sediments"
), class = "factor"),
dry.mass = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
perc.OM = c(47.1300248455119, 47.4260808104607, 47.7349307375515, 46.4501104675465, 44.1513415737111, 43.5608499678045, 42.9921259842519, 42.2674677347574, 39.6643311064039,
39.0968130690949, 46.7768514928267, 46.9211608642763, 46.7877013177158,
47.0709930313588, 44.3241152581706, 43.7905468025952, 41.706074101281,
36.5061097383474, 37.4329041152142, 37.7757939038389)), .Names = c("lake",
"date", "depth", "depth.m", "rep", "sed", "notes", "dry.mass",
"perc.OM"), comment = c("working data frame of the sediment char from the 2007 sed survey created:", "Wed Apr 27 14:23:33 2011"), row.names = c(NA, 20L), class = "data.frame")
and the complete output of str(survey_OM) is:
> str(survey_OM)
'data.frame': 780 obs. of 9 variables:
$ lake : Factor w/ 22 levels "E-2","E-4","E pond",..: 3 3 3 3 3 3 3 3 3 3 ...
..- attr(*, "comment")= chr "names of the lakes"
$ date : Factor w/ 14 levels "2007/06/15","2007/06/18",..: 2 2 2 2 2 2 2 2 2 2 ...
..- attr(*, "comment")= chr "date that the cores were collected"
$ depth : Factor w/ 4 levels "E","epi","H",..: 2 2 2 2 2 2 2 2 2 2 ...
..- attr(*, "comment")= chr "relative depth ID; epi = shallowest corable Z, hypo = deepest Z, S, M, D = shallow, med, deep"
$ depth.m : Factor w/ 28 levels "","10.9","12.9",..: 6 6 6 6 6 6 6 6 6 6 ...
..- attr(*, "comment")= chr "depth that core was collected in m"
$ rep : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "comment")= chr "replicate ID for core"
$ sed : atomic 0 1 2 3 4 5 6 7 8 9 ...
..- attr(*, "comment")= chr "depth of sample from sed/water interface in cm"
$ notes : Factor w/ 5 levels "","col on SE side",..: 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "comment")= chr "comments on sample"
$ dry.mass: atomic 0 0 0 0 0 0 0 0 0 0 ...
..- attr(*, "comment")= chr "dry mass of the sediment at that sed Z in g/m^2"
$ perc.OM : atomic 47.1 47.4 47.7 46.5 44.2 ...
..- attr(*, "comment")= chr "percent OM of the samp. based on LOI at 550d C"
- attr(*, "comment")= chr "working data frame of the sediment char from the 2007 sed survey created:" "Wed Apr 27 14:23:33 2011"
Looking at utils:::str.default, we see that we get the usual output of int, num, etc., only if a particular if statement is true. We get atomic if that statement is false (where it would otherwise have been int, num, etc.). The test relies on is.vector.

Looking at the help page for is.vector, we see that it returns TRUE only if the object is a vector with no attributes other than names. So in a data frame where an integer column b has an extra attribute, calling str on it gives atomic for b instead of int.

I see in your edit that you have extra attributes (the comment attributes) on the elements of your data frame, and that you're getting these extra lines about your attributes as well, so it would seem that this explains it.
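
To check this yourself, here is a minimal sketch. It is not the original example from this answer: the data frame df, its columns a and b, and the attribute text are all made up for illustration. It shows that an extra attribute leaves the class untouched but makes is.vector return FALSE, which is what str.default keys on. What str prints for such a column depends on your R version, so no str output is asserted here.

```r
# Hypothetical toy data frame; the names df, a and b are illustrative only.
df <- data.frame(a = 1:3, b = 4:6)

# Attach an extra attribute to column b, similar to the comment
# attributes on the columns of survey_OM.
attr(df$b, "comment") <- "an extra attribute on b"

# The class and type of b are unaffected by the attribute.
class(df$b)       # "integer"
is.atomic(df$b)   # TRUE
is.numeric(df$b)  # TRUE

# is.vector() is FALSE as soon as there is an attribute other than names.
is.vector(df$a)   # TRUE
is.vector(df$b)   # FALSE

# What str() shows for b varies with the R version, so inspect it directly.
str(df)

# Removing the attributes from a copy makes is.vector() TRUE again.
b_plain <- df$b
attributes(b_plain) <- NULL
is.vector(b_plain)  # TRUE
```

So atomic here is not a class. It appears to be the label str falls back to for an atomic vector that fails the is.vector test, while class() keeps reporting integer or numeric.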