HW17 Solutions In HW12, you previously analyzed the dataset "Insurance" from the MASS package using linear regression techniques.Now we will consider the data as being subject to a Poisson model, Claims ~ Poisson(Holders * lambda) (*) (A) In this model, logLik = C - lambda*sum(Holders) + sum(Claims*log(lambda) where C depends on data Holders, Claims but no lambda. Then the formula for the MLE is > lam.ML = sum(Insur$Claims)/sum(Insur$Holders) > lam.ML [1] 0.1348945 ## Posterior density is Gamma(1+sum(Insur$Claims), 0.1+sum(Insur$Holders). ---------------------- (B) Pseudo-Dataset simulation methods ## (1) > PseudoY1 = array( rpois(1000*64, lam.ML*c(outer(rep(1,1000), Insur$Holders))), c(1000,64)) ## (2) > lamvec = rgamma(1000,1+sum(Insur$Claims), 0.1+sum(Insur$Holders)) PseudoY2 = array( rpois(1000*64, c(outer(lamvec,Insur$Holders))), c(1000,64)) ## (3) > indvec = array(sample(1:64, 1000*64, replace=T), c(1000,64)) PseudoX3 = array(Insur$Holders[c(indvec)], c(1000,64)) PseudoY3 = array( rpois(1000*64, lam.ML*c(PseudoX3)), c(1000,64)) ## (4) use PseudoX3 as the array of X=Holders values in this part > PseudoY4 = array(Insur$Claims[c(indvec)], c(1000,64)) ----------------------- (C) Find 90% two-sided confidence intervals for lambda based on the four methods in (B). ## (1) > CI1 = lam.ML - quantile(apply(PseudoY1,1,sum)/sum(Insur$Holders)-lam.ML, probs=c(0.95,0.05)) ## In this part, a confidence interval based on large-sample normality ## of the pseudo-sample MLEs is not a bad idea, but is probably ## not quite as good. ## (2) > CI2 = qgamma(c(.05,.95), 1+sum(Insur$Claims), 0.1+sum(Insur$Holders)) ## OR, the "empirical" version of this without using the explicit qgamma function > CI2B = quantile(lamvec, probs=c(.05, .95)) ## (3) "Parametric Bootstrap percentile" > CI3 = lam.ML - quantile(apply(PseudoY3,1,sum)/apply(PseudoX3,1,sum)-lam.ML, probs=c(0.95,0.05)) ## (4) "Nonparametric Bootstrap percentile" > CI4 = lam.ML - quantile(apply(PseudoY4,1,sum)/apply(PseudoX3,1,sum)-lam.ML, probs=c(0.95,0.05)) ## OR ELSE the studentized version using as the consistent estimator of the standard deviation of the MLE ratio-estimator sd(Y-lamhat*X)/sum(X) > MLboot4 = apply(PseudoY4,1,sum)/apply(PseudoX3,1,sum) CI4B = lam.ML - ((sd(Insur$Claims-lam.ML*Insur$Holders)/sum(Insur$Holders))* quantile( (MLboot4-lam.ML)/(apply(PseudoY4 - MLboot4*PseudoX3,1,sd)/apply(PseudoX3,1,sum)), probs=c(.95,.05)) ) ## Simultaneous comparison of all of these intervals: round(array(rbind(CI1,CI2,CI2B,CI3,CI4,CI4B), c(6,2), dimnames=list( c("ParBoot","Bayes.exact", "Bayes.empir", "NP-Par.Boot", "NPboot.Pct","NPboot.Stud"), c("Lower","Upper"))), 5) Lower Upper ParBoot 0.13117 0.13875 Bayes.exact 0.13101 0.13891 Bayes.empir 0.13095 0.13898 NP-Par.Boot 0.13089 0.13897 NPboot.Pct 0.12101 0.14401 NPboot.Stud 0.12232 0.14636