Stat 705 HW 16 solution
=======================

cigcanc = read.table(
    "http://www-users.math.umd.edu/~evs/s430.old/Data/cigcanc.dat",
    header=T)
fitmod = step(lm(LUNG ~ CIG, data=cigcanc),
    LUNG ~ CIG + BLAD + KID + LEUK, k=3.84)
  ...
> fitmod$call
lm(formula = LUNG ~ CIG + BLAD, data = cigcanc)

> c(fitmod$coef, summary(fitmod)$sig)
(Intercept)         CIG        BLAD
  4.8901756   0.3516871   1.4561620   2.9299503

(a) rho0 = 1/2.93^2      ### = 0.1165
### Start with rho ~ Gamma(0.5, 0.5), i.e. a=1, lambda=1,
### for no particular reason.  Let's take

mu0 = c(1,0,0) ;  Sigma0 = diag(10,3,3)
### [Sigma0 eigenvalues are large compared to
###  the squares of coef/sig^2.]

XtX = solve(summary(fitmod)$cov.unsc)
sumY2 = sum(cigcanc$LUNG^2)
sumXY = c(t(model.matrix(fitmod)) %*% cigcanc$LUNG)
Tv = sumXY + solve(Sigma0, mu0)
RSS = sum(fitmod$resid^2)

### Recall from the BayesConjug.pdf handout that with p=3,
### the posterior density for rho is approximately
### Gamma(0.5*(n+a+p), 0.5*(lambda+RSS)) = Gamma(24, 176.48),
### since 0.5*(1+RSS) = 176.4845.
### This Gamma posterior is an extremely peaked density very
### close to Normal(24/176.5, 24/176.5^2) = Normal(.136, .0278^2),
### and we can use either this Gamma or this Normal in
### figuring out further posterior-related integrals.  A more
### accurate calculation of the posterior rho density using
### the exact formula given in the BayesConjug.pdf handout is
### a little difficult because of the need for rescaling in the
### integrals (to avoid astronomically small normalizing constants)
### and would not really improve the final answers.
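### As a quick numerical cross-check of the Gamma(24, 176.48) posterior
### (a sketch in Python with scipy rather than R; only the shape and rate
### values come from the solution above, the rest is the standard Gamma
### mean and sd):

```python
# Cross-check, not part of the R solution: the Gamma(shape=24, rate=176.48)
# posterior for rho has mean shape/rate and sd sqrt(shape)/rate, which is
# what the peaked-normal approximation uses.
from scipy.stats import gamma

shape, rate = 24.0, 176.48
post = gamma(a=shape, scale=1.0 / rate)   # scipy parameterizes by scale = 1/rate
print(round(post.mean(), 4))              # -> 0.136
print(round(post.std(), 4))               # -> 0.0278
```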
Then the posterior means for the coefficients can be obtained by
integrating out rho: multiply the posterior normal density for beta
given rho by the rho posterior density, and integrate over rho.

### We integrate each of the three mean-components of
###   N( (XtX + Sigma0^(-1)*rho)^(-1) Tv ,  rho^(-1)*(XtX + Sigma0^(-1)*rho)^(-1) )
### against the Gamma(24, 176.48) density in rho.

posmean = numeric(3)
for (i in 1:3) posmean[i] = integrate( function(rho) {
     outval = numeric(length(rho))
     for (j in 1:length(rho)) outval[j] =
         solve(XtX + rho[j]*solve(Sigma0), Tv)[i] * dgamma(rho[j], 24, 176.48)
     outval }, 0, Inf)$val

> posmean
[1] 4.9120339 0.3516577 1.4512191

### This Bayesian posterior mean should be compared with the
### least-squares estimates:
> fitmod$coef
(Intercept)         CIG        BLAD
  4.8901756   0.3516871   1.4561620

(b) We could do this as a posterior bivariate integral, but it is much
easier to program as a Monte Carlo simulated probability:

tmpInd = numeric(1000)
beta = array(0, c(1000,3))
rho = rgamma(1000, 24, 176.48)
library(MASS)
for(i in 1:1000) {
   Mtmp = solve(XtX + rho[i]*solve(Sigma0))
   beta[i,] = mvrnorm(1, Mtmp %*% Tv, (1/rho[i])*Mtmp)
   tmpInd[i] = (beta[i,2] > 0.13056 & beta[i,2] < 0.57282 &
                beta[i,3] > 0.17891 & beta[i,3] < 2.73342) }

> round( mean(tmpInd) + c(mean=0, LowerCL=-1, UpperCL=1) *
         1.96*sqrt(mean(tmpInd)*(1-mean(tmpInd))/1000), 4)
   mean LowerCL UpperCL
 0.9310  0.9153  0.9467

### So our point estimate is 0.9310 and the 95% CI is (0.9153, 0.9467).
### It makes sense that the joint probability is less than 0.95; if the
### two coefficients were independent we would expect the probability
### to be close to .95^2 = .9025, but in fact the correlation between
### the two coefficients is high:
> cor(beta[,2], beta[,3])
[1] -0.6862057

(c) Finally, we now generate 1000 posterior datasets using the
posterior simulated rho[i] and beta[i,] values, and get the
corresponding absolute maximum standardized residuals.
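### The binomial normal-approximation confidence interval used in (b) is
### easy to reproduce outside R (a minimal Python sketch; p_hat and n_sim
### are the simulated probability and sample size from the run above):

```python
# Normal-approximation 95% CI for a Monte Carlo simulated probability:
#   p_hat +/- 1.96 * sqrt(p_hat*(1-p_hat)/n_sim)
import math

p_hat, n_sim = 0.9310, 1000   # values taken from the simulation output above
se = math.sqrt(p_hat * (1 - p_hat) / n_sim)
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
print(round(lo, 4), round(hi, 4))   # -> 0.9153 0.9467
```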
maxsres = numeric(1000)
Xmod = model.matrix(fitmod)
resvar.unsc = diag(44) - Xmod %*% summary(fitmod)$cov.unsc %*% t(Xmod)
### this is the 44x44 matrix always used in obtaining variances of residuals
SEresid.unsc = sqrt(diag(resvar.unsc))
for(i in 1:1000) {
   ### simulate a new dataset: mean Xmod %*% beta[i,] plus
   ### normal errors with sd 1/sqrt(rho[i])
   Ynew = c(Xmod %*% beta[i,]) + rnorm(44, 0, 1/sqrt(rho[i]))
   tmp = lm(Ynew ~ CIG + BLAD, data=cigcanc)
   res = tmp$resid
   maxsres[i] = max(abs(res)/SEresid.unsc)/summary(tmp)$sigma }

> hist(maxsres, nclass=24, main="Posterior Predicted \n Maximum Std Resid")
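### The standardized-residual scaling used in (c) rests on the fact that
### residual i has variance sigma^2*(1 - h_ii), where h_ii is the i-th
### diagonal entry of the hat matrix X (X'X)^(-1) X'.  A self-contained
### sketch in Python/numpy (the design matrix and coefficients here are
### synthetic illustrations, not the cigcanc data):

```python
# Sketch of one maximum-standardized-residual computation, assuming a
# synthetic n=44, p=3 regression (illustrative data, not cigcanc).
import numpy as np

rng = np.random.default_rng(0)
n, p = 44, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([5.0, 0.35, 1.5]) + rng.normal(scale=3.0, size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix X (X'X)^(-1) X'
res = y - H @ y                             # least-squares residuals
sigma = np.sqrt(res @ res / (n - p))        # residual standard error
se_unsc = np.sqrt(np.diag(np.eye(n) - H))   # sqrt(1 - h_ii), like SEresid.unsc
max_sres = np.max(np.abs(res) / se_unsc) / sigma
print(max_sres)
```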