Package:
Data:
- A rasterStack with 10 bands.
- Each of the bands contains an image area surrounded by NAs
- Bands are logical, i.e. “1” for image data and “0”/NA for surrounding area
- The “image areas” of each band do not align completely with each other, though most have partial overlaps
Objective:
- Write a fast function that can return either a rasterLayer or cell numbers for each “zone”, for instance a pixel containing data only from bands 1 and 2 falls in zone 1, a pixel containing data only from bands 3 and 4 falls in zone 2, etc. If a rasterLayer is returned, I need to be able to match the zone value with band numbers later.
First attempt:
# Possible band combinations
values = integer(0)
for(i in 1:nlayers(myraster)){
combs = combn(1:nlayers(myraster), i)
for(j in 1:ncol(combs)){
values = c(values, list(combs[,j]))
}
}
# Define the zone finding function
find_zones = function(bands){
# The intersection of the bands of interest
a = subset(myraster, 1)
values(a) = TRUE
for(i in bands){
a = a & myraster[[i]]
}
# Union of the remaining bands
b = subset(myraster, 1)
values(b) = FALSE
for(i in seq(1:nlayers(myraster))[-bands]){
b = b | myraster[[i]]
}
#plot(a & !b)
cells = Which(a & !b, cells=TRUE)
return(cells)
}
# Applying the function
results = lapply(values, find_zones)
My current function takes a very long time to execute. Can you think of a better way? Note that I don’t simply want to know how many bands have data at each pixel, I also need to know which bands. The purpose of this is to process different the areas differently afterwards.
Note also that the real-life scenario is a 3000 x 3000 or more raster with potentially more than 10 bands.
EDIT
Some sample data consisting of 10 offset image areas:
# Sample data
library(raster)
for(i in 1:10) {
start_line = i*10*1000
end_line = 1000000 - 800*1000 - start_line
offset = i * 10
data = c(rep(0,start_line), rep(c(rep(0,offset), rep(1,800), rep(0,200-offset)), 800), rep(0, end_line))
current_layer = raster(nrows=1000, ncols=1000)
values(current_layer) = data
if(i == 1) {
myraster = stack(current_layer)
} else {
myraster = addLayer(myraster, current_layer)
}
}
NAvalue(myraster) = 0 # You may not want to do this depending on your solution...

EDIT : Answer updated using Nick’s trick and matrix multiplication.
You could try the following function, optimized by using Nick’s trick and matrix multiplication. The bottleneck now is filling up stack with the seperate layers, but I guess the timings are quite OK now. Memory usage is a bit less, but given your data and the nature of R, I don’t know if you can nibble of a bit without hampering the performance big time.
The function returns either a rasterStack with the different level combinations present in the plot (that’s not all possible level combinations, so you have some gain there already), or a matrix with the level number and level names. This allows you to do something like :
to get the level names for each cell point. As shown below, you can easily put that matrix inside a rasterLayer object.
The function :
Proof of concept, using the data of RobertH :