The truncated normal is given by:
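dtnorm <- function(x, mean, sd, a, b) {
  # density is zero outside the truncation interval [a, b]
  ifelse(x >= a & x <= b,
         dnorm(x, mean, sd) / (pnorm(b, mean, sd) - pnorm(a, mean, sd)),
         0)
}
ptnorm <- function(x, mean, sd, a, b) {
  # clamp x to [a, b] so the CDF is 0 below a and 1 above b
  x <- pmin(pmax(x, a), b)
  (pnorm(x, mean, sd) - pnorm(a, mean, sd)) /
    (pnorm(b, mean, sd) - pnorm(a, mean, sd))
}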
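A quick sanity check for definitions like these is that the density integrates to one over [a, b]. The sketch below is only illustrative: the bounds a = 0 and b = 0.04 are assumed from the stated data range (they are not the actual minLoose and maxLoose), the parameter values are arbitrary, and the name dtnorm_chk is hypothetical so that it does not overwrite the definitions above.

```r
# Sanity check for a truncated-normal density (illustrative values only).
dtnorm_chk <- function(x, mean, sd, a, b) {
  inside <- x >= a & x <= b
  dens <- dnorm(x, mean, sd) / (pnorm(b, mean, sd) - pnorm(a, mean, sd))
  ifelse(inside, dens, 0)          # zero outside the truncation interval
}

a <- 0; b <- 0.04                  # assumed bounds, not the poster's values
area <- integrate(dtnorm_chk, lower = a, upper = b,
                  mean = 0.01, sd = 0.01, a = a, b = b)$value
print(area)                        # should be very close to 1
```

If the area is not close to 1, the normalising constant in the density is probably wrong.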
The fit is given by:
library(fitdistrplus)  # provides fitdist()
fitdist(data, "tnorm", method = "mle",
        start = list(mean = mapply("[[", results[1], 1),
                     sd = mapply("[[", results[1], 2)),
        fix.arg = list(a = minLoose, b = maxLoose))
where results[i] is a matrix holding the MLE results of fitdist using norm instead of tnorm.
I get the following results for tnorm:
mean=-0.00844725266454969, sd=0.012540928272073
whereas with norm:
mean=0.00748402597402597, sd=0.00614293813955003
The data are all larger than 0 and smaller than 0.04, so the MLE obtained for tnorm does not seem right. Any advice?
Thanks!
The fact that your data are all above 0 has little bearing on whether the “mean” of the best fit to a truncated distribution does or doesn’t exceed 0. You are fitting the right tail of a Normal distribution to your data. The estimated location parameter for the truncated distribution is not really a mean, but rather where the mean would be in an untruncated dataset whose right tail has the same density “shape” as your data. (This is really a stats question rather than an R question.)
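A small simulation may make this concrete. The sketch below assumes a parent Normal with location -0.008 and sd 0.0125 (values chosen only to resemble the reported tnorm fit) and assumed truncation bounds of 0 and 0.04. It keeps only the draws that fall inside the bounds and compares the parent location with the mean of what survives.

```r
set.seed(1)                            # reproducible draws
mu <- -0.008; sigma <- 0.0125          # assumed parent Normal, location below 0
a <- 0; b <- 0.04                      # assumed truncation bounds
draws <- rnorm(1e5, mu, sigma)
kept <- draws[draws > a & draws < b]   # only the right tail survives truncation
print(c(parent_location = mu,
        sample_mean = mean(kept),
        n_kept = length(kept)))
```

Every retained value is positive, so the sample mean is positive, even though the location parameter of the distribution that generated the data is negative.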
You can find the formula for the expected value of a doubly truncated Normal in the Moments section of the Wikipedia article:
http://en.wikipedia.org/wiki/Truncated_normal_distribution
It is readily translatable into calls to dnorm and pnorm.
A further thought: check out the facilities for working with truncated distributions in the packages ‘gamlss’ and ‘gamlss.tr’.
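As a minimal sketch of that translation: the mean of a Normal truncated to [a, b] is mean + sd * (phi(alpha) - phi(beta)) / (Phi(beta) - Phi(alpha)), with alpha = (a - mean)/sd and beta = (b - mean)/sd, where phi and Phi are the standard Normal density and CDF (dnorm and pnorm in R). The function name tnorm_mean is hypothetical, and the bounds a = 0 and b = 0.04 are assumed from the stated data range, since the actual minLoose and maxLoose were not given.

```r
# Expected value of a Normal(mean, sd) truncated to [a, b].
tnorm_mean <- function(mean, sd, a, b) {
  alpha <- (a - mean) / sd             # standardised lower bound
  beta  <- (b - mean) / sd             # standardised upper bound
  mean + sd * (dnorm(alpha) - dnorm(beta)) / (pnorm(beta) - pnorm(alpha))
}

# Reported tnorm estimates, with assumed bounds a = 0 and b = 0.04
m <- tnorm_mean(mean = -0.00844725266454969, sd = 0.012540928272073,
                a = 0, b = 0.04)
print(m)
```

Under those assumed bounds the implied mean comes out at roughly 0.0075, close to the mean from the plain norm fit, which suggests the negative location estimate is consistent with the data.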