I’m trying to make a hexbin representation of data in several categories. The problem is, facetting these bins seems to make all of them different sizes.
set.seed(1) #Create data
bindata <- data.frame(x=rnorm(100), y=rnorm(100))
fac_probs <- dnorm(seq(-3, 3, length.out=26))
fac_probs <- fac_probs/sum(fac_probs)
bindata$factor <- sample(letters, 100, replace=TRUE, prob=fac_probs)
library(ggplot2) #Actual plotting
library(hexbin)
ggplot(bindata, aes(x=x, y=y)) +
  geom_hex() +
  facet_wrap(~factor)

Is it possible to set something to make all these bins physically the same size?
As Julius says, the problem is that `hexGrob` doesn’t get the information about the bin sizes, and guesses it from the differences it finds within the facet.
Obviously, it would make sense to hand `dx` and `dy` to a `hexGrob`: not having the width and height of a hexagon is like specifying a circle by center without giving the radius.
Workaround:
The `resolution` strategy works if the facet contains two adjacent hexagons that differ in both x and y. So, as a workaround, I’ll manually construct a data.frame containing the x and y center coordinates of the cells, the factor for facetting, and the counts.
In addition to the libraries specified in the question, I’ll need
and also `bindata$factor` actually needs to be a factor:
Now, calculate the basic hexagon grid
Next, we need to calculate the counts depending on `bindata$factor`
As we have the cell IDs, we can merge this data.frame with the proper coordinates:
Here’s what the data.frame looks like:
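Putting the steps above together, here is a minimal sketch of how `hexdf` can be built. It counts per cell and factor level with base R's `table()` on the cell IDs stored in `h@cID`, which is an assumption of this sketch and not necessarily the route taken above; `xbins = 5` and the intermediate name `centers` are likewise only illustrative.

```r
library(hexbin)

# Data from the question
set.seed(1)
bindata <- data.frame(x = rnorm(100), y = rnorm(100))
fac_probs <- dnorm(seq(-3, 3, length.out = 26))
fac_probs <- fac_probs / sum(fac_probs)
bindata$factor <- as.factor(sample(letters, 100, replace = TRUE, prob = fac_probs))

# One hexagon grid for the whole data set. IDs = TRUE keeps, for every
# observation, the ID of the cell it falls into (slot h@cID).
# xbins = 5 is an arbitrary choice for this sketch.
h <- hexbin(bindata$x, bindata$y, xbins = 5, IDs = TRUE,
            xbnds = range(bindata$x), ybnds = range(bindata$y))

# Counts per (cell, factor level), including the 0 counts
counts <- as.data.frame(table(ID = h@cID, factor = bindata$factor),
                        responseName = "counts")
counts$ID <- as.integer(as.character(counts$ID))

# Center coordinates of the occupied cells, merged in by cell ID
centers <- data.frame(hcell2xy(h), ID = h@cell)
hexdf <- merge(counts, centers, by = "ID")

head(hexdf)
```

Every facet now refers to the same set of cell centers, so each one carries the full grid of occupied cells even where its own count is 0.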
Plotting this with ggplot (use the command below) yields the correct bin sizes, but the figure looks a bit odd: 0-count hexagons are drawn, but only where some other facet has this bin populated. To suppress the drawing, we can set the counts there to `NA` and make the `na.value` completely transparent (it defaults to grey50):
yields the figure at the top of the post.
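A self-contained sketch of this plotting step, under the same assumptions as the workaround (data from the question, one common grid, counts per cell and facet). The fill colours and `xbins = 5` are arbitrary choices for illustration, and the counts are obtained with base R's `table()`, which may differ from the commands originally used.

```r
library(ggplot2)
library(hexbin)

# Data from the question
set.seed(1)
bindata <- data.frame(x = rnorm(100), y = rnorm(100))
fac_probs <- dnorm(seq(-3, 3, length.out = 26))
fac_probs <- fac_probs / sum(fac_probs)
bindata$factor <- as.factor(sample(letters, 100, replace = TRUE, prob = fac_probs))

# Counts per cell and facet on one common grid
h <- hexbin(bindata$x, bindata$y, xbins = 5, IDs = TRUE)
hexdf <- as.data.frame(table(ID = h@cID, factor = bindata$factor),
                       responseName = "counts")
hexdf$ID <- as.integer(as.character(hexdf$ID))
hexdf <- merge(hexdf, data.frame(hcell2xy(h), ID = h@cell), by = "ID")

# Empty bins become NA so that they can be drawn fully transparent
hexdf$counts[hexdf$counts == 0] <- NA

p <- ggplot(hexdf, aes(x = x, y = y, fill = counts)) +
  geom_hex(stat = "identity") +       # counts are precomputed, no binning here
  facet_wrap(~factor) +
  coord_equal() +
  scale_fill_gradient(low = "grey80", high = "#000040",
                      na.value = "#00000000")  # alpha 00 = fully transparent
print(p)
```

The `NA` rows stay in the data on purpose: they still tell the renderer where the neighbouring cell centers are, while remaining invisible.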
This strategy works as long as the binwidths are correct without facetting. If the binwidths are set very small, the `resolution` may still yield too large `dx` and `dy`. In that case, we can supply `hexGrob` with two adjacent bins (but differing in both x and y) with `NA` counts for each facet.
An additional advantage of this approach is that we can delete all the rows with 0 counts already in `counts`, in this case reducing the size of `hexdf` by roughly 3/4 (122 rows instead of 520):
The plot looks exactly the same as above, but you can visualize the difference with `na.value` not being fully transparent.
More about the problem
The problem is not unique to facetting: it occurs whenever too few bins are occupied, so that no "diagonally" adjacent bins are populated.
Here’s a series of more minimal data that shows the problem:
First, I trace `hexBin` so I get all center coordinates of the same hexagonal grid that `ggplot2:::hexBin` and the object returned by `hexbin` use:
Set up a very small data set:
And plot:
I repeat the plot, leaving out data point 2:
Note that the results from `hexbin` are on the same grid (cell numbers did not change, just cell 5 is not populated any more and thus not listed), and grid dimensions and ranges did not change. But the plotted hexagons did change dramatically.
Also notice that `hgridcent` forgets to return the center coordinates of the first cell (lower left).
Though it gets populated:
Here, the rendering of the hexagons cannot possibly be correct – they do not belong to one hexagonal grid.
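The effect can also be seen without any plotting. Below is a minimal sketch, assuming (as described above) that the hexagon size is guessed from the smallest differences between the occupied cell centers. The four-point data set is made up, and `min_step` and `bin` are hypothetical helpers written only for this illustration.

```r
library(hexbin)

# Hypothetical helper: smallest gap between distinct coordinate values,
# i.e. what a renderer has to fall back on when it only sees the
# occupied centers and is not told dx and dy.
min_step <- function(v) {
  u <- sort(unique(round(v, 10)))
  if (length(u) < 2) return(NA_real_)   # a single value gives no step at all
  min(diff(u))
}

# Made-up data set; fixed bounds keep both binnings on the same grid
d <- data.frame(x = c(0.2, 1.1, 1.9, 3.8), y = c(0.3, 1.0, 2.1, 3.7))
bin <- function(dat) {
  hexbin(dat$x, dat$y, xbins = 5, IDs = TRUE,
         xbnds = c(0, 4), ybnds = c(0, 4))
}

h_all  <- bin(d)
h_drop <- bin(d[-2, ])   # leave out data point 2

xy_all  <- hcell2xy(h_all)
xy_drop <- hcell2xy(h_drop)

# The grid itself is the same in both cases ...
identical(h_all@dimen, h_drop@dimen)

# ... but the step sizes guessed from the occupied centers can differ
c(all = min_step(xy_all$x), drop = min_step(xy_drop$x))
c(all = min_step(xy_all$y), drop = min_step(xy_drop$y))
```

Because the occupied cells after dropping a point are a subset of the original ones, the guessed step can only stay the same or grow, which is why sparsely populated panels end up with oversized hexagons.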