Homework Problem Set 5, Due Friday September 26, 2016.
--------------------------------------------------------
Assigned 9/14/2016, due 9/26

(a) Use the following statements to generate a dataset:

    > set.seed(75)
      Z = sample(1:4, 160, replace=T)
      X = rbeta(160, .6, .8)
      Y = 2 + 3*X + 0.2*(Z-2.5)*X + rchisq(160,3)

Plot and label a scatterplot of the (X[i],Y[i]) pairs using these generated points, using recognizably different plotting characters for points associated with the 4 different values of Z. Identify by index number (1:160) on the plot all 6 points for which Y[i] > 12. Make sure that your graph has suitable axis labels and a legend with appropriate text showing the plotting character used for the points according to their Z values.

(b) The optimizing functions "optim" and "nlm" each allow the user to supply analytically defined first and second derivatives that can be used directly in the iterative optimization steps: "nlm" reads them from "gradient" and "hessian" attributes attached to the value of the function being minimized, while "optim" accepts a separate gradient function through its "gr" argument. If these are not supplied, both optimizing functions numerically approximate the first and second derivatives of the function at successive iteratively defined points.
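Returning to part (a), one possible sketch of the plotting code is given below; the pch codes, legend position, and label offset are illustrative choices, not required ones.

```r
## Part (a) sketch -- plotting-character and placement choices are illustrative
set.seed(75)
Z = sample(1:4, 160, replace=T)
X = rbeta(160, .6, .8)
Y = 2 + 3*X + 0.2*(Z-2.5)*X + rchisq(160,3)

plot(X, Y, pch=c(1,2,3,4)[Z], xlab="X", ylab="Y",
     main="Y versus X, by Z group")
big = which(Y > 12)                       # indices of the large-Y points
text(X[big], Y[big], labels=big, pos=4, cex=0.7)
legend("topleft", legend=paste("Z =", 1:4), pch=1:4)
```

Any set of four visibly distinct pch values is fine; the point is that the legend entries match the plotting characters actually used.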
Consider the maximization of the likelihood defined from the following simulated dataset:

    > set.seed(33333)
    > Xvec = rbeta(200, 3.3, 1.4)

The negative log-likelihood function can be defined by

    NlogL = function(theta, xv) -sum(dbeta(xv, theta[1], theta[2], log=T))

Do the maximization of this function over theta, with xv fixed at Xvec, in 2 different ways:

 (i) by using your choice of nlm or optim (with method="L-BFGS-B"), with the starting value theta.ini = c(3.3, 1.4), without specifying attributes enabling the optimization function to use analytical gradients; and

 (ii) using the same optimizing function (nlm or optim), with the same initial value, that you used in (i), but this time specifying your function to be optimized along with a "gradient" and maybe also a "hessian" attribute, as indicated toward the end of the class R log Sep12F16.RLog.

Does this make any difference at all to the time it takes to minimize, or to the result (the MLE) produced by the optimizing function?

#>>>>>>>>>>>>>>> Note added 9/19: You can see a log illustrating some of the coding needed for this problem under "Rlog.nlm.txt" within the Rlogs directory on the course web-page.

#----------------------------------------------------------------
EXTRA CREDIT, 2 points. Can you find an example of a minimization problem where it makes a LOT of difference to the time used in minimization (in R, using one of the two optimizing functions nlm or optim, with method="L-BFGS-B") whether you supply analytical gradient and hessian attributes?
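For part (b)(ii), a minimal sketch of what "attaching attributes" means for nlm is shown below. The gradient and Hessian come from differentiating the Beta negative log-likelihood, which involves the digamma and trigamma functions; the function name NlogL2 and the comparison code are illustrative, not prescribed.

```r
## Part (b) sketch: the same negative log-likelihood, now carrying
## analytic derivatives as attributes in the form nlm() looks for.
set.seed(33333)
Xvec = rbeta(200, 3.3, 1.4)

NlogL = function(theta, xv) -sum(dbeta(xv, theta[1], theta[2], log=T))

NlogL2 = function(theta, xv) {
  n = length(xv)
  val = -sum(dbeta(xv, theta[1], theta[2], log=T))
  ## gradient of the NEGATIVE log-likelihood w.r.t. (alpha, beta):
  ##   n*(digamma(alpha)-digamma(alpha+beta)) - sum(log x)   and
  ##   n*(digamma(beta) -digamma(alpha+beta)) - sum(log(1-x))
  attr(val, "gradient") = n*(digamma(theta) - digamma(sum(theta))) -
        c(sum(log(xv)), sum(log(1-xv)))
  ## 2x2 Hessian: diagonal n*(trigamma(theta[j])-trigamma(alpha+beta)),
  ## off-diagonal -n*trigamma(alpha+beta)
  attr(val, "hessian") = n*(diag(trigamma(theta)) - trigamma(sum(theta)))
  val
}

fit1 = nlm(NlogL,  c(3.3, 1.4), xv=Xvec)   # numerical derivatives
fit2 = nlm(NlogL2, c(3.3, 1.4), xv=Xvec)   # analytic gradient + hessian
## compare fit1$estimate with fit2$estimate, and wrap each call in
## system.time() to compare the timings asked about in the problem
```

Both calls should converge to essentially the same MLE; the question is whether the analytic derivatives change the iteration count or the elapsed time appreciably.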