When we train a CTR (click-through rate) model, sometimes we need to calculate the real

Question

Asked: June 13, 2026

When we train a CTR (click-through rate) model, we sometimes need to calculate the real CTR from historical data, like this:


                 #(click)
    ctr   =  ----------------
              #(impressions)

We know that if the number of impressions is too small, the calculated CTR is not reliable. So we always set a threshold and keep only the items with enough impressions.
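For illustration, the thresholding approach described above might look like the sketch below. The item names, the counts, and the `MIN_IMPRESSIONS` cut-off are all assumptions made up for this example, not values from real data.

```python
# Hypothetical history: item id -> (clicks, impressions).
history = {
    "ad_a": (1, 2),
    "ad_b": (20, 100),
    "ad_c": (200, 1000),
}

MIN_IMPRESSIONS = 50  # assumed cut-off, chosen only for illustration


def raw_ctr(clicks, impressions):
    """Plain ratio #(clicks) / #(impressions)."""
    return clicks / impressions


def filtered_ctrs(history, min_impressions=MIN_IMPRESSIONS):
    """Keep only items whose impression count reaches the threshold."""
    return {
        item: raw_ctr(clicks, impressions)
        for item, (clicks, impressions) in history.items()
        if impressions >= min_impressions
    }


print(filtered_ctrs(history))  # {'ad_b': 0.2, 'ad_c': 0.2}
```

Here `ad_a` has a raw CTR of 0.5 from only two impressions and is dropped, while `ad_b` and `ad_c` pass with the same CTR even though one has ten times more data behind it.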

But we also know that the more impressions there are, the higher the confidence in the CTR. So my question is: is there an impressions-normalized statistical method to calculate the CTR?

Thanks!

1 Answer

Editorial Team · Answer 1 · 2026-06-13T12:15:25+00:00

You probably want a confidence interval for your estimated CTR. The Wilson score interval is a good one to try.

The Wilson score interval is

$\frac{\hat p + \frac{z^2}{2n} \pm z \sqrt{\frac{\hat p (1 - \hat p)}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}$

You need the following quantities to calculate it:

\hat p is the observed CTR (#clicks divided by #impressions)
n is the total number of impressions
z = z_{1-α/2} is the (1-α/2) quantile of the standard normal distribution
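If a confidence level other than 95% is needed, the quantile can be computed rather than hard-coded. A minimal sketch using `statistics.NormalDist` from the Python standard library (3.8+); the helper name `z_score` is made up for this example:

```python
from statistics import NormalDist


def z_score(confidence_level):
    """Return the (1 - alpha/2) quantile of the standard normal distribution."""
    alpha = 1.0 - confidence_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


print(round(z_score(0.95), 2))  # 1.96
print(round(z_score(0.99), 2))  # 2.58
```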

A simple implementation in Python is shown below. I use z_{1-α/2} = 1.96, which corresponds to a 95% confidence interval. The three test results are listed at the end of the code.

# clicks      # impressions       # conf interval
2             10                  (0.06, 0.51)
20            100                 (0.13, 0.29)
200           1000                (0.18, 0.23)

Now you can set a threshold on the calculated confidence interval, for example on its lower bound, instead of on the raw impression count.

from math import sqrt

def wilson(clicks, impressions, z=1.96):
    """Return the Wilson score interval (lower, upper) for the observed CTR.

    z = 1.96 corresponds to 95% confidence.
    """
    n = impressions
    if n == 0:
        # No impressions: nothing is known, so return the widest interval.
        return 0.0, 1.0
    phat = clicks / n                 # observed CTR
    denom = 1.0 + z * z / n
    center = phat + z * z / (2 * n)
    margin = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    return (center - margin) / denom, (center + margin) / denom

if __name__ == '__main__':
    for clicks, impressions in [(2, 10), (20, 100), (200, 1000)]:
        lower, upper = wilson(clicks, impressions)
        print("(%.4f, %.4f)" % (lower, upper))

"""
--------------------
results (z = 1.96, rounded to 4 decimals):
(0.0567, 0.5098)
(0.1334, 0.2888)
(0.1764, 0.2259)
"""
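One possible way to apply such a threshold is to rank or filter items by the lower end of the interval, so that items with little data are penalized automatically. The sketch below is only an illustration: the item names, the `MIN_LOWER_BOUND` value, and the helper `wilson_lower_bound` are assumptions, and the right cut-off depends on your data.

```python
from math import sqrt


def wilson_lower_bound(clicks, impressions, z=1.96):
    """Lower end of the Wilson score interval; 0.0 when there is no data."""
    if impressions == 0:
        return 0.0
    n = impressions
    phat = clicks / n
    center = phat + z * z / (2 * n)
    margin = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    return (center - margin) / (1 + z * z / n)


# Hypothetical items as (clicks, impressions); all have the same raw CTR of 0.2.
items = {"ad_a": (2, 10), "ad_b": (20, 100), "ad_c": (200, 1000)}

MIN_LOWER_BOUND = 0.10  # assumed threshold, for illustration only

# Sort by lower bound, highest first, then keep items above the threshold.
ranked = sorted(items, key=lambda k: wilson_lower_bound(*items[k]), reverse=True)
kept = [k for k in ranked if wilson_lower_bound(*items[k]) >= MIN_LOWER_BOUND]

print(ranked)  # ['ad_c', 'ad_b', 'ad_a']
print(kept)    # ['ad_c', 'ad_b']
```

All three items have the same raw CTR, but the one with the most impressions ranks first because its interval is the narrowest.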
