I found this code on the internet that compares a normal distribution to several Student's t distributions:
x <- seq(-4, 4, length=100)
hx <- dnorm(x)
degf <- c(1, 3, 8, 30)
colors <- c("red", "blue", "darkgreen", "gold", "black")
labels <- c("df=1", "df=3", "df=8", "df=30", "normal")
plot(x, hx, type="l", lty=2, xlab="x value",
ylab="Density", main="Comparison of t Distributions")
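for (i in 1:4){
lines(x, dt(x,degf[i]), lwd=2, col=colors[i])
}
# Legend built from the labels and colors defined above; the dashed black line is the normal curve
legend("topright", inset=.05, title="Distributions", legend=labels, lwd=2, lty=c(1, 1, 1, 1, 2), col=colors)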
I would like to adapt this to compare my own data to a normal distribution. This is my data:
library(quantmod)
getSymbols("^NDX",src="yahoo", from='1997-6-01', to='2012-6-01')
daily<- allReturns(NDX) [,c('daily')]
dailySerieTemporel<-ts(data=daily)
ss<-na.omit(dailySerieTemporel)
The objective is to see whether my data is normal or not. Can someone help me out a bit with this? Thank you very much, I really appreciate it!
If you are only concerned with knowing whether your data is normally distributed, you can apply the Jarque-Bera test. Under the null hypothesis of this test, your data is normally distributed; see details here. You can perform this test using the jarque.bera.test function from the tseries package.
From the result, you can see that your data is not normally distributed, since the null is rejected even at the 1% level.
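As a rough sketch of what the test computes, the statistic can be reproduced in base R from the sample skewness and kurtosis. The data here are simulated heavy-tailed values standing in for ss (an assumption made so the snippet runs offline), and jb_by_hand is a hypothetical helper name, not a function from any package. If tseries is installed, the last line cross-checks against jarque.bera.test.

```r
# Stand-in data (assumption): simulated heavy-tailed "returns".
# Replace ss_demo with as.numeric(ss) to use the real series.
set.seed(1)
ss_demo <- rt(2000, df = 3) * 0.01

# Hypothetical helper: Jarque-Bera statistic computed by hand.
jb_by_hand <- function(x) {
  x  <- as.numeric(x)
  n  <- length(x)
  m  <- mean(x)
  m2 <- mean((x - m)^2)                 # second central moment
  skew <- mean((x - m)^3) / m2^1.5      # sample skewness
  kurt <- mean((x - m)^4) / m2^2        # sample kurtosis (normal = 3)
  stat <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  # Under the null, the statistic is asymptotically chi-squared with 2 df
  list(statistic = stat,
       p.value = pchisq(stat, df = 2, lower.tail = FALSE))
}

res <- jb_by_hand(ss_demo)
print(res)

# Optional cross-check with the packaged test, if tseries is available
if (requireNamespace("tseries", quietly = TRUE)) {
  print(tseries::jarque.bera.test(ss_demo))
}
```

A p-value below 0.01 means normality is rejected at the 1% level.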
To see why your data is not normally distributed, you can take a look at the descriptive statistics:
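One way to get such a table with base R only is sketched below. The data are again a simulated stand-in for ss (an assumption), and the layout, with skewness and excess kurtosis as the last two rows, is chosen for illustration rather than taken from a particular package.

```r
# Stand-in data (assumption); replace with as.numeric(ss) for the real series
set.seed(1)
x <- rt(2000, df = 3) * 0.01

m  <- mean(x)
m2 <- mean((x - m)^2)   # second central moment

# One-column table of descriptive statistics
stats <- cbind(ss = c(
  nobs     = length(x),
  Minimum  = min(x),
  Maximum  = max(x),
  Mean     = m,
  Median   = median(x),
  Stdev    = sd(x),
  Skewness = mean((x - m)^3) / m2^1.5,
  Kurtosis = mean((x - m)^4) / m2^2 - 3   # excess kurtosis: 0 for a normal
))
print(round(stats, 4))
```

For normally distributed data both of the last two rows would be close to zero.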
From the last two rows, one can see that ss has excess kurtosis and that the skewness is not zero. This is the basis of the Jarque-Bera test.
But if you are interested in comparing the actual distribution of your data against a normally distributed random variable with the same mean and variance as your data, you can first estimate the empirical density function from your data using a kernel and plot it. Then you only have to generate a normal random variable with the same mean and variance as your data. Do something like this:
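A minimal sketch of that comparison follows. It uses simulated stand-in data (an assumption; swap in as.numeric(ss)), and instead of simulating normal draws it evaluates the normal density directly with dnorm, which gives a smooth reference curve with the same mean and standard deviation.

```r
# Stand-in data (assumption); replace with as.numeric(ss) for the real series
set.seed(1)
x <- rt(2000, df = 3) * 0.01

mu    <- mean(x)
sigma <- sd(x)

# Kernel estimate of the empirical density
dens <- density(x)

# Normal density with the same mean and sd, on the same grid
norm_y <- dnorm(dens$x, mean = mu, sd = sigma)

plot(dens, lwd = 2, col = "blue",
     ylim = range(0, dens$y, norm_y),
     main = "Empirical density vs. normal", xlab = "Daily return")
lines(dens$x, norm_y, lwd = 2, lty = 2, col = "red")
legend("topright",
       legend = c("Kernel density of data", "Normal, same mean and sd"),
       col = c("blue", "red"), lty = c(1, 2), lwd = 2)
```

A sharper peak and fatter tails than the dashed curve are the visual counterpart of excess kurtosis.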
In the same fashion you can add a curve from another probability distribution.
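For instance, a Student's t curve can be overlaid the same way. In this sketch the degrees of freedom nu = 4 is an arbitrary value picked for illustration, not an estimate, and the data are again a simulated stand-in for ss. The t density is shifted and rescaled so that it has the same mean and variance as the data.

```r
# Stand-in data (assumption); replace with as.numeric(ss) for the real series
set.seed(1)
x <- rt(2000, df = 3) * 0.01

mu    <- mean(x)
sigma <- sd(x)
nu    <- 4                               # hypothetical degrees of freedom
s     <- sigma * sqrt((nu - 2) / nu)     # scale so the t curve has variance sigma^2

dens <- density(x)
# Density of a location-scale t: dt((x - mu) / s, nu) / s
t_y  <- dt((dens$x - mu) / s, df = nu) / s

plot(dens, lwd = 2, col = "blue",
     ylim = range(0, dens$y, t_y),
     main = "Empirical density vs. scaled t", xlab = "Daily return")
lines(dens$x, t_y, lwd = 2, lty = 2, col = "darkgreen")
legend("topright",
       legend = c("Kernel density of data", "Scaled t, df = 4"),
       col = c("blue", "darkgreen"), lty = c(1, 2), lwd = 2)
```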
The tests suggested by @Alex Reynolds will help you if your interest is in knowing which distribution your data might have been drawn from. If this is your goal, you can look up the goodness-of-fit tests in any statistics textbook. Nevertheless, if you just want to know whether your variable is normally distributed, then the Jarque-Bera test is good enough.