Homework Set 13, Due Friday November 4, 2016.
---------------------------------------------
Assigned 10/24/2016, due 11/2 [Extended to 6pm Friday 11/4]
==============================

EM Algorithm for Estimating a Mixture Distribution

Data values X_1,...,X_n are nonnegative independent random integers which
have probability mass functions of the form

    a * dpois(x, lambda_1) + (1-a) * dpois(x, lambda_2)

where a in (0,1) and 0 < lambda_1 < lambda_2 are unknown parameters. This
means that each X_i can be viewed as depending on an unobserved
"group-label" variable G_i = 1, 2 with P(G_i = 1) = a, where
X_i ~ Poisson(lambda_1) given G_i = 1 and X_i ~ Poisson(lambda_2)
given G_i = 2.

(1) Generate a dataset of size n = 300 under this model following
set.seed(7757), with a = .3, lambda_1 = 2, lambda_2 = 3.5. Find the
maximum likelihood estimates for (a, lambda_1, lambda_2) in two ways:
(a) with a straightforward likelihood maximization, and (b) using the
EM algorithm. In both cases use (.5, .1, .5) as starting values.

(2) Repeat the same problem where the data now consist of n = 60 clusters
(X_{i,1}, ..., X_{i,5}) of 5 conditionally iid Poisson(lambda_{G_i})
random variables, where again the parameters are a = P(G_1 = 1) and
lambda_1 < lambda_2.

For both parts, report how many iterations your nlm or optim and EM
algorithms take to converge, and confirm that both methods converge to
the same parameter estimates in each problem part.
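As a point of reference, the EM iteration for a two-component Poisson mixture has closed-form E- and M-steps. The sketch below is one possible implementation, not a required template: the function name `em.pois.mix`, the convergence tolerance, and the particular simulation recipe are all my choices, and the exact simulated values depend on the order of random-number calls after set.seed, so your dataset may differ.

```r
## Simulate one possible dataset for Problem (1)
## (assumed recipe: draw labels first, then the Poisson values).
set.seed(7757)
n <- 300
g <- sample(1:2, n, replace = TRUE, prob = c(.3, .7))  # G_i = 1 w.p. a = .3
x <- rpois(n, c(2, 3.5)[g])

## EM for the two-component Poisson mixture (illustrative sketch).
em.pois.mix <- function(x, theta0 = c(.5, .1, .5),
                        tol = 1e-8, maxit = 1000) {
  a <- theta0[1]; l1 <- theta0[2]; l2 <- theta0[3]
  for (it in 1:maxit) {
    ## E-step: posterior probability that X_i came from component 1.
    p1 <- a * dpois(x, l1)
    p2 <- (1 - a) * dpois(x, l2)
    w  <- p1 / (p1 + p2)
    ## M-step: weighted proportion and weighted means.
    a.new  <- mean(w)
    l1.new <- sum(w * x) / sum(w)
    l2.new <- sum((1 - w) * x) / sum(1 - w)
    if (max(abs(c(a.new, l1.new, l2.new) - c(a, l1, l2))) < tol) break
    a <- a.new; l1 <- l1.new; l2 <- l2.new
  }
  list(est = c(a = a, lambda1 = l1, lambda2 = l2), iterations = it)
}

em.pois.mix(x)
```

For part (a), the same `dpois` mixture density can be plugged into a negative log-likelihood function and passed to nlm or optim directly.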
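For Problem (2), only the E-step changes: given G_i, the five observations in a cluster are conditionally iid, so each component's likelihood for cluster i is the product of the five Poisson probabilities. A minimal sketch of that modified weight calculation, assuming the data are stored as an n x 5 matrix and using a hypothetical helper name `cluster.weights`:

```r
## E-step weights for clustered data (Problem 2), illustrative sketch.
## X is an n x 5 matrix; row i holds cluster (X_{i,1}, ..., X_{i,5}).
cluster.weights <- function(X, a, l1, l2) {
  ## Per-cluster log-likelihood under each component: sum of log dpois
  ## across the 5 conditionally iid observations in the row.
  logp1 <- log(a)     + rowSums(dpois(X, l1, log = TRUE))
  logp2 <- log(1 - a) + rowSums(dpois(X, l2, log = TRUE))
  1 / (1 + exp(logp2 - logp1))   # posterior P(G_i = 1 | cluster i)
}

## M-step weighted means then use the cluster averages, e.g.
##   l1.new <- sum(w * rowMeans(X)) / sum(w)
```

Equivalently, since the cluster total S_i = sum_j X_{i,j} is sufficient (S_i ~ Poisson(5 lambda_{G_i}) given G_i), the same weights can be computed from `dpois(rowSums(X), 5 * l)`: the multinomial factor not involving lambda cancels in the ratio.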