I have a data frame in R:
   buys ges   dif bin
1 22.34  12 10.34   0
2 55.56  12 43.56   0
3 78.33  12 66.33   0
4  9.99  12  2.01   1
..   ..  ..    ..  ..
dif is just abs(buys - ges), and bin comes from an ifelse that is 1 if dif <= 10 and 0 otherwise. I’m trying to maximize the sum of the bin column by changing the ges column, with the constraint that ges is the same for all rows. I’ve tried a couple of packages but can’t figure out how to set up the maximization. Thanks for any suggestions.
> a <- rnorm(100)
> buys <- data.frame(a*100)
> buys <- round(abs(buys), 2)
> summary(buys)
a...100 gs dif bin
Min. : 0.89 Min. :15 Min. : 1.76 Min. :0.00
1st Qu.: 38.29 1st Qu.:15 1st Qu.: 23.29 1st Qu.:0.00
Median : 72.89 Median :15 Median : 57.88 Median :0.00
Mean : 83.91 Mean :15 Mean : 70.52 Mean :0.13
3rd Qu.:123.50 3rd Qu.:15 3rd Qu.:108.50 3rd Qu.:0.00
Max. :269.11 Max. :15 Max. :254.11 Max. :1.00
> gs1 <- 5
> buys$gs <- gs1
> buys$dif <- abs(buys[,1] - buys$gs)
> buys$bin <- ifelse(buys$dif<=10,1,0)
> colnames(buys) <- c("buys","gs","dif","bin")
> head(buys)
buys gs dif bin
1 7.48 5 2.48 1
2 79.08 5 74.08 0
3 139.22 5 134.22 0
4 41.60 5 36.60 0
5 38.35 5 33.35 0
6 157.72 5 152.72 0
> sum(buys$bin)
[1] 10
> num_buys=function(x)
+ {
+ return(length(buys$buys[buys$buys>=x-10 | buys$buys<=x+10]))
+ }
> ans2 <- optimize(f=num_buys,interval=c(min(buys$buys),max(buys$buys)),maximum=TRUE)
> ans2
$maximum
[1] 269.1099
$objective
[1] 100
Since values of bin are either 0 or 1, for a given value of ges we’re really just counting the number of elements in buys that are in the interval [ges-10, ges+10]. Visually, one could imagine “sliding” the interval [ges-10, ges+10], starting at ges=min(buys) and ending at ges=max(buys), and treating the number of entries of buys inside the interval as the value of a function of ges. With that function in hand, we can use optimize to find a maximum.
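A minimal sketch of that counting function and the optimize call. The example data, the seed, and the name ans are assumptions for illustration, not the values from the original post. The condition uses & so that both bounds must hold. With |, as in the question's num_buys, every value passes at least one of the two tests, so the count is always length(buys), which is why the objective there came back as 100.

```r
# Hypothetical example data: 10,000 normal draws (mean 50, sd 10).
# The seed and these parameters are assumptions, not the post's values.
set.seed(1)
buys <- rnorm(10000, mean = 50, sd = 10)

# Count the entries of buys within 10 units of a candidate ges.
# Both bounds are required, hence & rather than |.
num_buys <- function(ges) {
  sum(buys >= ges - 10 & buys <= ges + 10)
}

# Search between the smallest and largest observed values.
ans <- optimize(f = num_buys, interval = range(buys), maximum = TRUE)
ans$maximum    # the ges that optimize settled on
ans$objective  # sum(bin) at that ges
```

With a different seed or sample the numbers will differ, so they should not be expected to match the figures quoted in this answer.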
So, in the example, a maximum value of sum(bin) would be 6808, and this maximum would occur when ges=50.16788. Of course, this makes perfect sense, since about 68% of normally distributed values fall within one standard deviation of the mean, which here means within 10 units of 50. 😀
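One caveat: the count is a step function of ges, and optimize is designed for continuous functions, so it may stop on a plateau that is not the global maximum. As an alternative sketch (not the method used above), the maximum can be found exactly. Any best window can be slid to the right until its left edge touches a data value, so it is enough to try each value of buys as the left edge. The helper name best_ges and the example data are assumptions for illustration.

```r
# Hypothetical example data; the seed and parameters are assumptions.
set.seed(1)
buys <- rnorm(10000, mean = 50, sd = 10)

# best_ges is a hypothetical helper: it tries every data value as the
# left edge of a window of total width 2 * width and keeps the fullest.
best_ges <- function(buys, width = 10) {
  b <- sort(buys)
  # Number of sorted values <= left edge + 2 * width.
  last <- findInterval(b + 2 * width, b)
  # Number of sorted values strictly below each left edge.
  below <- findInterval(b, b, left.open = TRUE)
  counts <- last - below
  i <- which.max(counts)
  # Center ges between the first and last covered values.
  list(ges = (b[i] + b[last[i]]) / 2, count = counts[i])
}

res <- best_ges(buys)
res$ges    # a ges that attains the maximum
res$count  # the maximum of sum(bin)
```

Because the count is flat between breakpoints, many values of ges usually attain the same maximum. This sketch returns just one of them.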