I’ve used the following code in R to determine how well observed values (20, 20, 0 and 0 for example) fit expected values/ratios (25% for each of the four cases, for example):
> chisq.test(c(20,20,0,0), p=c(0.25, 0.25, 0.25, 0.25))
Chi-squared test for given probabilities
data: c(20, 20, 0, 0)
X-squared = 40, df = 3, p-value = 1.066e-08
How can I replicate this in Python? I’ve tried using the chisquare function from scipy but the results I obtained were very different; I’m not sure if this is even the correct function to use. I’ve searched through the scipy documentation, but it’s quite daunting as it runs to 1000+ pages; the numpy documentation is almost 50% more than that.
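scipy.stats.chisquare expects observed and expected absolute frequencies, not ratios. You can obtain what you want by multiplying each expected ratio by the total number of observations and passing the resulting counts as f_exp. When the expected values are uniformly distributed over the classes, as they are here, you can leave out the computation of the expected values altogether, because chisquare assumes equal expected frequencies by default.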
The first returned value is the χ² statistic, the second the p-value of the test.
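As a minimal sketch of the two calls described above, using the numbers from the question (the variable names are only illustrative and do not come from the original answer):

```python
import numpy as np
from scipy.stats import chisquare

# Observed counts and expected ratios, taken from the question.
observed = np.array([20, 20, 0, 0])
expected_ratios = np.array([0.25, 0.25, 0.25, 0.25])

# chisquare wants absolute expected frequencies, so scale the
# ratios by the total number of observations.
expected = expected_ratios * observed.sum()

# Explicit expected counts.
stat, p_value = chisquare(observed, f_exp=expected)
print(stat, p_value)

# Uniform case: omit f_exp and let chisquare assume equal frequencies.
stat_uniform, p_uniform = chisquare(observed)
print(stat_uniform, p_uniform)
```

Both calls should give a statistic of 40 and a p-value of about 1.066e-08, in line with the R output in the question. The default degrees of freedom are the number of classes minus one, which is 3 here, the same as R reports.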