Homework Problem 10, STAT 705, Fall 2015.
Assigned 10/14/2015, Due Monday 10/26/2015

(1) Generate a single dataset of values (X_{ij}, Y_{ij}), i=1,...,1000, j=1,...,40, according to the following distributional rules:

    (*) epsilon_{ij} are iid ~ N(0,1), and
        independent of the epsilon's, X_{ij} are independent and ~ Unif[-2,2]*sqrt(1+0.2*j), and
        Y_{ij} = 0.4 + 0.1*j + X_{ij} + epsilon_{ij}

Put your dataset Data1 (the X's and Y's) into a 1000 x 40 x 2 array.

Generate a second 1000 x 40 x 2 dataset Data2 according to exactly the same method, except that the values epsilon_{ij} are iid distributed according to t_5, Student's t distribution with 5 degrees of freedom.

(2) For both of your datasets, view the j indices as "cluster" or "stratum" labels, and exhibit your empirically estimated cluster means and standard deviations -- compared across the two datasets -- in two informative graphs with x-axis corresponding to the j-index.

(3) For both of your datasets simulated in (1), maximize the likelihood for the model

        Y_{ij} = a + mu*j + b*X_{ij} + sigma*Z_{ij}

where the X_{ij} as generated above ARE observed and part of your dataset, and where the Z_{ij} ~ N(0,1) are NOT observed in your dataset. Here the unknown 4-dimensional statistical parameter is (a,mu,b,sigma). In your output, also give the estimated variance-covariance matrix for the jointly estimated parameters, and provide some indication that your likelihood maximization has converged.

STORE YOUR SEEDS OR YOUR DATASETS FOR FUTURE REFERENCE: WE WILL USE THE SAME DATA BY A DIFFERENT METHOD IN AT LEAST ONE FUTURE PROBLEM SET.

Provide your R code for all 3 parts, explaining (and where suitable, checking) what it does and explaining your outputs and what they mean.
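
One possible way to organize the three parts in R is sketched below. It is only an illustration under stated assumptions, not a required or model solution: the seed value 705 and the helper names make_data, cluster_summary, negloglik, fit_mle and check_ols are all hypothetical, and the choices to optimize over log(sigma) with optim's BFGS method and to convert the covariance matrix back to the sigma scale by the delta method are one option among several.

```r
set.seed(705)  # hypothetical seed value; record whichever seed you actually use

n <- 1000  # index i
J <- 40    # index j (the "cluster" label)

# Part (1): build one n x J x 2 array; eps_fun(m) must return m iid errors.
make_data <- function(eps_fun) {
  jmat <- matrix(rep(1:J, each = n), n, J)                 # jmat[i, j] = j
  X <- matrix(runif(n * J, -2, 2), n, J) * sqrt(1 + 0.2 * jmat)
  eps <- matrix(eps_fun(n * J), n, J)
  Y <- 0.4 + 0.1 * jmat + X + eps
  array(c(X, Y), dim = c(n, J, 2),
        dimnames = list(NULL, NULL, c("X", "Y")))
}

Data1 <- make_data(rnorm)                          # N(0,1) errors
Data2 <- make_data(function(m) rt(m, df = 5))      # t_5 errors

# Part (2): empirical mean and SD of Y within each cluster j.
cluster_summary <- function(dat) {
  Y <- dat[, , "Y"]
  data.frame(j = seq_len(ncol(Y)), mean = colMeans(Y), sd = apply(Y, 2, sd))
}

# Part (3): negative normal log-likelihood, theta = (a, mu, b, log sigma).
negloglik <- function(theta, dat) {
  X <- dat[, , "X"]
  Y <- dat[, , "Y"]
  m <- theta[1] + theta[2] * col(X) + theta[3] * X         # col(X)[i, j] = j
  -sum(dnorm(Y, mean = m, sd = exp(theta[4]), log = TRUE))
}

fit_mle <- function(dat) {
  N <- length(dat[, , "Y"])
  start <- c(a = 0, mu = 0, b = 0, logsigma = 0)
  # Minimize the average negative log-likelihood (better numerical scaling).
  fit <- optim(start, function(th) negloglik(th, dat) / N,
               method = "BFGS", hessian = TRUE,
               control = list(reltol = 1e-10, maxit = 500))
  est <- c(fit$par[1:3], sigma = exp(fit$par[[4]]))
  V_log <- solve(N * fit$hessian)          # inverse observed information
  Jac <- diag(c(1, 1, 1, est[["sigma"]]))  # delta method: log sigma -> sigma
  V <- Jac %*% V_log %*% Jac
  dimnames(V) <- list(names(est), names(est))
  list(est = est, vcov = V, convergence = fit$convergence, counts = fit$counts)
}

# Check: under normal errors the MLE of (a, mu, b) equals least squares.
check_ols <- function(dat) {
  d <- data.frame(y = c(dat[, , "Y"]), x = c(dat[, , "X"]),
                  j = c(col(dat[, , "X"])))
  coef(lm(y ~ j + x, data = d))
}

fit1 <- fit_mle(Data1)
fit2 <- fit_mle(Data2)
```

In this sketch, fit1$convergence equal to 0 is optim's own indication of successful termination, and comparing fit1$est[1:3] with check_ols(Data1) gives an independent check. The data frames from cluster_summary(Data1) and cluster_summary(Data2) can be overlaid against j, for instance with matplot, to produce the two graphs in part (2). For Data2, note that a t_5 variable has variance 5/3, so an estimated sigma near sqrt(5/3), about 1.29, would be expected rather than 1.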