I am having trouble developing an intuition for the probabilistic interpretation of logistic regression. Specifically, why is it valid to treat the output of the logistic regression function as a probability?
A classification problem can be approached with a probabilistic generative model: you model the class-conditional densities p(x|C_k) (i.e. given the class C_k, how likely it is to observe x) and the class priors p(C_k) (i.e. how probable class C_k is before seeing any data), and then apply Bayes’ theorem to obtain the posterior probabilities p(C_k|x) = p(x|C_k) p(C_k) / p(x) (i.e. given x, what’s the probability that it belongs to class C_k). It is called generative because, as Bishop says in his book, you could use the model to generate synthetic data by drawing values of x from the marginal distribution p(x). This all just means that every time you assign something to a specific class (e.g. a tumor of a given size being malignant or benign), there will be a probability of that assignment being right or wrong.
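To make the Bayes’ theorem step concrete, here is a minimal sketch in Python. It assumes one-dimensional Gaussian class-conditional densities with a shared standard deviation; the means, standard deviation, and priors are made-up numbers for illustration, and the names `gaussian_pdf` and `posteriors` are hypothetical, not taken from Bishop.

```python
import math

# Hypothetical parameters, for illustration only (tumor size in cm).
MU_BENIGN, MU_MALIGNANT = 2.0, 4.0       # class means
SIGMA = 1.0                              # shared standard deviation
PRIOR_BENIGN, PRIOR_MALIGNANT = 0.7, 0.3 # class priors p(C_0), p(C_1)

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posteriors(x):
    """Return (p(C_0|x), p(C_1|x)) via Bayes' theorem."""
    # Joint terms p(x|C_k) * p(C_k) for each class.
    joint_benign = gaussian_pdf(x, MU_BENIGN, SIGMA) * PRIOR_BENIGN
    joint_malignant = gaussian_pdf(x, MU_MALIGNANT, SIGMA) * PRIOR_MALIGNANT
    # Marginal p(x) is the sum of the joint terms over all classes.
    evidence = joint_benign + joint_malignant
    return joint_benign / evidence, joint_malignant / evidence

for size in (1.0, 3.0, 5.0):
    p_benign, p_malignant = posteriors(size)
    print(f"size={size}: p(C_0|x)={p_benign:.3f}, p(C_1|x)={p_malignant:.3f}")
```

Dividing by p(x) is what forces the two posteriors to sum to 1, so they can be read as probabilities of class membership.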
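Logistic regression uses a sigmoid function (or logistic function) to classify the data. Since this function only takes values strictly between 0 and 1, its output can be read as a probability. Ultimately, you’re looking for p(C_k|x) (in the example, x could be the size of the tumor, with C_0 the class that represents benign and C_1 malignant), and in the case of logistic regression with two classes, this is modeled by:
p(C_1|x) = sigma(w^T x), and therefore p(C_0|x) = 1 - sigma(w^T x),
where sigma(a) = 1 / (1 + exp(-a)) is the sigmoid function, w^T is the transpose of the weight vector w, and x is your feature vector. The link to Bayes’ theorem is that, for two classes, the posterior can always be written as p(C_1|x) = sigma(a) with a = ln[ p(x|C_1) p(C_1) / (p(x|C_0) p(C_0)) ]; logistic regression assumes that this log-odds a is a linear function of x. I highly recommend you read Chapter 4 of Bishop’s book.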
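As a rough illustration of the formula, the sketch below evaluates sigma(w^T x) for a single example in plain Python. The weights are hypothetical (not fitted to any data), the constant 1.0 in the feature vector is an assumed convention so that the first weight acts as a bias, and `predict_proba` is just an illustrative name.

```python
import math

def sigmoid(a):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def predict_proba(w, x):
    """Return (p(C_0|x), p(C_1|x)) with p(C_1|x) = sigmoid(w^T x)."""
    # Dot product w^T x, written out for plain Python lists.
    activation = sum(w_i * x_i for w_i, x_i in zip(w, x))
    p_malignant = sigmoid(activation)
    return 1.0 - p_malignant, p_malignant

# Hypothetical weights: [bias, weight on tumor size in cm].
w = [-6.0, 2.0]
# Feature vector: constant 1.0 for the bias term, then the tumor size.
x = [1.0, 3.5]

p_benign, p_malignant = predict_proba(w, x)
print(f"p(C_0|x) = {p_benign:.3f}, p(C_1|x) = {p_malignant:.3f}")

# The log-odds of the two outputs recovers the linear term w^T x.
log_odds = math.log(p_malignant / p_benign)
print(f"log-odds = {log_odds:.3f}")
```

The two outputs are non-negative and sum to 1 for any w and x, and their log-odds equals w^T x, which is the sense in which the model treats the sigmoid output as a posterior probability.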