I am trying to create a text-output of backup-durations sorted into 30-minute increment bins

Question

0

Asked: June 15, 20262026-06-15T19:24:12+00:00 2026-06-15T19:24:12+00:00

I am trying to create a text-output of backup-durations sorted into 30-minute increment bins

0

I am trying to create a text-output of backup-durations sorted into 30-minute increment bins for 6 of our backup servers. An example of the input data (called newdata) is as follows:

      backup_server   client      duration  
1     bkp01           server_A    60       
2     bkp01           server_A    34       
3     bkp01           server_A    230     
4     bkp02           server_A    14      
5     bkp02           server_C    29   
6     bkp02           server_C    62

Now I’ve been able to bin everything together with:

br.br <-seq(0,max(newdata$duration),by=30)
cbind(table(cut(newdata$duration,br.br,right=FALSE)))

Which provides this kind of output:

                    [,1]
[0,30)              3523
[30,60)             1394
[60,90)              230
[90,120)              35
[120,150)             10
[150,180)              0
[180,210)              3

What I’d like to see is something like this:

[,1]                bkp01      bkp02
[0,30)               523        422
[30,60)              394         30
[60,90)              130         10
[90,120)               5          3
[120,150)              1          2
[150,180)              0         10
[180,210)              2         20

The closest I got was using the aggregate function but doesn’t really do what I need.

> aggregate(newdata$Duration, by=list(newdata$TSM_server),FUN=mean)
  Group.1        x
1 bkp01       31.13307
2 bkp02       16.58491

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-15T19:24:13+00:00

If this is not what you want (and by comparing @joran’s solution to mine you should see that there is considerable ambiguity to be resolved regarding what summary measure is desired)….

 aggregate(newdata$Duration, 
           by=list(dur.cut=cut(newdata$duration,br.br,right=FALSE) , 
                   server=newdata$TSM_server),
            FUN=mean)

Then try this:

 tapply( newdata$Duration, 
           INDEX=list(dur.cut=cut(newdata$duration,br.br,right=FALSE) , 
                   server=newdata$TSM_server),
            FUN=mean)

Sometimes setting INDEX= interaction(var1, var2) produces slightly different and at times more desirable results. ( In testing these I do observe that the column names are different than your example.)

 aggregate(newdata$duration, 
            by=list(dur.cut=cut(newdata$duration,br.br,right=FALSE) , 
                    server=newdata$backup_server),
             FUN=mean)
#------------
  dur.cut server    x
1 [30,60)  bkp01 34.0
2 [60,90)  bkp01 60.0
3  [0,30)  bkp02 21.5
4 [60,90)  bkp02 62.0

 tapply( newdata$duration, 
            INDEX=list(dur.cut=cut(newdata$duration,br.br,right=FALSE) , 
                    server=newdata$backup_server),
             FUN=mean)
#-------------
           server
dur.cut     bkp01 bkp02
  [0,30)       NA  21.5
  [30,60)      34    NA
  [60,90)      60  62.0
  [90,120)     NA    NA
  [120,150)    NA    NA
  [150,180)    NA    NA
  [180,210)    NA    NA

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I am trying to create a text-output of backup-durations sorted into 30-minute increment bins

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply