Homework Set 12, Due Wednesday October 26, 2016. ------------------------------------------------------ Assigned 10/17/2016, due 10/26 SIMULATION AND PARAMETER ESTIMATION FROM A MISSPECIFIED MODEL > HW12dat = scan("http://www.math.umd.edu/~slud/s705/Homework/HW12dat.txt", sep=",") ## 199 numeric vector elements > mu.hat = mean(log(HW12dat)) sigsq.hat = (198/199)*var(log(HW12dat)) > c(mu=mu.hat, sigsq=sigsq.hat) mu sigsq 3.0910060 0.3254111 > NlogL = function(theta) -sum(dnorm(log(HW12dat),theta[1], sqrt(theta[2]), log=T)) > tmpfit = optim(c(mu.hat,sigsq.hat), NlogL, control=list(reltol=1.e-10), hessian=T, method="BFGS") $par [1] 3.0910060 0.3254111 $value [1] 170.6635 $counts function gradient 12 1 $convergence [1] 0 $message NULL $hessian [,1] [,2] [1,] 6.115341e+02 -7.105427e-09 [2,] -7.105427e-09 9.396863e+02 ### A-matrix times n > Ainv = solve(tmpfit$hess) > Ainv [,1] [,2] [1,] 1.635232e-03 1.236479e-14 [2,] 1.236479e-14 1.064185e-03 ### Now the gradient of the n=199 logLikelihood terms log f(X_i,theta) ### (OK to do this for the data log(X_i), in which case this logLik term is ### -log(sig)-(0.5/sig^2)*(log X_i-mu)^2 ## taking gradient wrt mu, sig : > gradlogdens = rbind((1/sigsq.hat)*(log(HW12dat)-mu.hat), (1/sqrt(sigsq.hat))*(-1 + (log(HW12dat)-mu-hat)^2/sigsq.hat) > apply(gradlogdens,1,sum) [1] 1.020989e-13 7.469025e-14 ### this checks the grad logLik = 0 ## Next we calculate the B matrix tmp = c(rbind(gradlogdens[1,]^2, gradlogdens[1,]*gradlogdens[2,], gradlogdens[2,]^2) %*%rep(1,199)) > Bmat = array(c(tmp[1],rep(tmp[2],2),tmp[3]), c(2,2)) > Bmat [,1] [,2] [1,] 611.5341 -2659.068 [2,] -2659.0677 22315.380 > Ainv [,1] [,2] [1,] 1.635232e-03 1.236479e-14 [2,] 1.236479e-14 1.064185e-03 > Ainv %*% Bmat %*% Ainv [,1] [,2] [1,] 0.001635232 -0.004627281 [2,] -0.004627281 0.025271937 ### The situation is very clear: the hessian tmpfit$hess ### and Bmat are both essentially diagonal 2x2 matrices ### with same upper-left entry, but lower-right differ by factor > 0.02527/1.0642e-3 [1] 23.74554 ## This says that the "robustified" confidence interval for sigma is > sqrt(23.74554) [1] 4.87294 ### roughly a multiple of 5 ## times as wide as the naive one calculated as though the data ## were log-normal. IN CLASS, IRECALL SAYING I THOUGHT I HAD GENERATED THE DATA FROM A GAMMA DENSITY. BUT IN FACT, I GENERATED THEM FROM A NORMAL DISTRIBUTION (LOOK AT THE HISTOGRAM OF THE UNTRANSFORMED DATA AND YOU WILL SEE), WITH MEAN AND STANDARD ROUGHLY DEVIATION ROUGHLY 9 AND 24 RESPECTIVELY. I ORIGINALLY GENERATED 200 POINTS, BUT ONE WAS NEGATIVE AND I DISCARDED IT. >