I’d like to classify the values of a data frame according to two columns.

Asked: June 11, 2026

I’d like to classify the values of a data frame according to two columns. Say I have the following data frame:

my.df <- data.frame(a=c(1:20), b=c(61:80))

Now I want to subdivide it into 8 areas: first split the 2D scatterplot into 4 equal quadrants, then overlay a rectangle in the middle that covers a quarter of each quadrant. So far I’ve been doing it in the following tedious way:

ar <- range(my.df$a)
br <- range(my.df$b)

aint <- seq(ar[1], ar[2], by=(ar[2]-ar[1])/4)
bint <- seq(br[1], br[2], by=(br[2]-br[1])/4)

my.df$z <- NA
my.df[which(my.df$a < aint[3] & my.df$b < bint[3]),"z"] <- 1
my.df[which(my.df$a < aint[3] & my.df$b >= bint[3]),"z"] <- 2
...
my.df[which(my.df$z == 1 & my.df$a >= aint[2] & my.df$b >= bint[2]),"z"] <- 5
...

I am sure there must be a way to do it in a neater and more general way, i.e. by writing a general function, but I am struggling to write one myself.

Also, I was surprised to see that after all of this, the class of the column z is automatically set to shingle. Why is that? How does R “know” that this is a shingle?

1 Answer

Editorial Team · Answer 1 · 2026-06-11T08:52:28+00:00

I’d approach it by first cutting the data into 16 groups (a and b each into 4 bins, independently) and then merging those 16 cells back into the 8 groups you want.

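# Bin each column into 4 equal-width intervals, labelled 1 to 4
my.df$a.q <- cut(my.df$a, breaks=4, labels=1:4)
my.df$b.q <- cut(my.df$b, breaks=4, labels=1:4)
# Combine both bin labels into one key such as "2.3" (a bin 2, b bin 3)
my.df$a.b.q <- paste(my.df$a.q, my.df$b.q, sep=".")
# Map each of the 16 cells to one of the 8 groups; unname() drops the keys
my.df$z <- unname(c("1.1"=1, "1.2"=1, "1.3"=2, "1.4"=2,
                    "2.1"=1, "2.2"=3, "2.3"=4, "2.4"=2,
                    "3.1"=5, "3.2"=6, "3.3"=7, "3.4"=8,
                    "4.1"=5, "4.2"=5, "4.3"=8, "4.4"=8)[my.df$a.b.q])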

This seems reasonable when plotted with one colour per group:

plot(my.df$a, my.df$b, col=my.df$z)
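Since you asked for something more general, here is a minimal sketch of the same idea wrapped in a function. The name classify_regions and the layout of the lookup matrix are my own assumptions, not an established API; the cell values simply mirror the mapping above. It relies on cut(..., labels = FALSE) returning integer bin codes and on indexing a matrix with a two-column index matrix.

```r
# Hypothetical helper (name and layout are illustrative assumptions):
# classify points into 8 regions by collapsing a 4 x 4 grid of bins.
classify_regions <- function(x, y) {
  # Integer bin codes 1..4 for each coordinate (equal-width intervals)
  xq <- cut(x, breaks = 4, labels = FALSE)
  yq <- cut(y, breaks = 4, labels = FALSE)
  # Rows are x bins, columns are y bins; values are the final group ids
  lookup <- matrix(c(1, 1, 2, 2,
                     1, 3, 4, 2,
                     5, 6, 7, 8,
                     5, 5, 8, 8),
                   nrow = 4, byrow = TRUE)
  # A two-column index matrix picks one cell of lookup per point
  lookup[cbind(xq, yq)]
}

my.df <- data.frame(a = 1:20, b = 61:80)
my.df$z <- classify_regions(my.df$a, my.df$b)
```

To change how the 16 cells are merged, you would only need to edit the lookup matrix.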

With data that covers the plane more fully (rerun the classification and the plot after creating it):

set.seed(1234)
my.df <- data.frame(a=runif(1000, 1, 20), b=runif(1000, 61, 80))
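For completeness, this is roughly how the random data can be run through the same steps. It is a self-contained sketch: the names pts, key and region are only illustrative, and the dashed grid lines are an addition of mine to make the bin boundaries visible.

```r
set.seed(1234)
pts <- data.frame(a = runif(1000, 1, 20), b = runif(1000, 61, 80))

# Same 4 x 4 binning as above, combined into keys like "2.3"
key <- paste(cut(pts$a, breaks = 4, labels = 1:4),
             cut(pts$b, breaks = 4, labels = 1:4), sep = ".")

# Same cell-to-group mapping as above
region <- c("1.1"=1, "1.2"=1, "1.3"=2, "1.4"=2,
            "2.1"=1, "2.2"=3, "2.3"=4, "2.4"=2,
            "3.1"=5, "3.2"=6, "3.3"=7, "3.4"=8,
            "4.1"=5, "4.2"=5, "4.3"=8, "4.4"=8)
pts$z <- unname(region[key])

# Number of points per group, then the coloured scatterplot
table(pts$z)
plot(pts$a, pts$b, col = pts$z, pch = 16)
# Dashed lines at the interior bin boundaries used by cut()
abline(v = seq(min(pts$a), max(pts$a), length.out = 5)[2:4], lty = 2)
abline(h = seq(min(pts$b), max(pts$b), length.out = 5)[2:4], lty = 2)
```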
