Fall 2022
Instructor: Professor Eric Slud, Statistics Program, Math Dept., Rm 2314, x5-5469, slud@umd.edu
Office hours: M 1-2, F 10-11 (initially), or email me to make an appointment (can be on Zoom).
Overview: This course introduces mathematical statistics at a theoretical graduate level, using tools of advanced calculus and basic analysis. The framework is to define families of probability models for observed-data structures and explain the sense in which functions of observed-data random
variables can give a good idea of which of those probability models governed a particular dataset.
The course objective is to treat diverse statistically interesting models for data in a conceptually
unified way; to define mathematical properties that good procedures of statistical inference should
have; and to prove that some common procedures have them. Aspects of the theoretical results are
illustrated using demonstrations with statistical simulation.
Prerequisite: Stat 410 or equivalent. You should be comfortable (after review) with joint densities, (multivariate, Jacobian) changes of variable, moment generating functions, conditional expectation, the Central Limit Theorem and Law of Large Numbers, and mathematical analysis proofs at the level of Math 410-411.
Required Course Text: P. Bickel & K. Doksum, Mathematical Statistics, vol.I, 2nd ed., Pearson Prentice Hall, 2007.
Recommended Texts: (i) George Casella and Roger Berger, Statistical Inference, 2nd ed., Duxbury, 2002. A sheet of errata in Casella & Berger compiled by Jeffrey Hart of Texas A&M Stat Dept can be found here.
Course Coverage: STAT 700 and 701 divide roughly into definitions and properties for finite-sample statistics in the Fall (STAT 700) and large-sample limit theory in the Spring (STAT 701). The division is not quite complete, because we will motivate many topics (Point Estimation, Confidence Intervals, identifiability) in terms of the Law of Large Numbers. The coverage in the Bickel & Doksum book for the Fall is roughly Chapters 1-4 along with related reading in Casella & Berger for special topics.
Grading: There will be graded homework sets roughly every 1.5-2 weeks (6 or 7 altogether); one in-class test, tentatively on Wed., Nov. 2; and an in-class Final Exam. The course grade will be based 45% on homeworks, 20% on the in-class test, and 35% on the Exam.
HONOR CODE
The Fall 2022 Course Syllabus for Stat 700 is linked to this course web-page (and also posted on the ELMS course pages).
(I) You can see
sample test problems for the 1st in-class test, along with the
Fall 2009 In-Class Test and a set of
(II) Old homework problem assignments in Casella and Berger from Fall 2014 can be found
here. You can also see most of the solutions to these problems.
(III) I have a paper on the topic of distributions related to the normal that are or are not uniquely determined by their moments.
(IV) The topic of mixture distributions and densities and their relation to hierarchical specification of a distributional model and to distribution functions of mixed type is elaborated in this handout. Here is an additional handout specifically on the identifiability of 2-component normal mixtures.
(V) A paper for a talk I gave at the 2013 Federal Committee on Statistical Methodology includes the idea of following on those policies (in this case, for callback of nonrespondents in the American Community Survey) that are on the admissible frontier with respect to a specified set of loss functions. This idea is related to the discussion of admissibility in Section 1.3.3 of Bickel and Doksum.
(VI) A handout on conjugate priors for
a class of exponential family densities and probability mass functions.
(VII) Handout on Cramer-Rao (Information) Inequality to supplement what we did in class and what is done in Bickel and Doksum's Section 3.4.2.
(VIII) Handout on EM Algorithm from STAT 705.
(IX) Lecture Notes from Stat 705 on Numerical Maximization of
Likelihoods.
(X) Topics on Statistical Simulation: There are two sorts of handouts on Simulation methods and
(XI) Background on Markov Chain Monte Carlo: First see Introduction and application of MCMC
Homework: Assignments, including any changes and hints, will continually be posted here. The most current form of the assignment will be posted also on ELMS. You can find old homework assignments cumulatively added to this text-file and selected problem solutions in the directory HWslns/.
HW 1 due Monday Sept.12, 11:59pm (upload to ELMS)
(A) Suppose that i.i.d. real random variables X_1,...,X_n are observed and can be assumed to follow one of the densities f(x,θ) from a family with real-valued unknown parameter θ. Suppose that there is a function r(x) such that R(θ) = ∫ r(x) f(x,θ) dx exists, is finite, and is strictly increasing in θ. Show that the parameter θ is identifiable from the data.
(B) In the setting of problem (A), explain (as constructively as possible) why there is a consistent (in probability) estimator g_n(X_1,...,X_n) of θ. Hint: Start from n^{-1} ∑_{1≤j≤n} r(X_j), and assume that R(θ) is continuous if you have to. An alternative assumption you may use instead is ∫ r^2(x) f(x,θ) dx < ∞ for all θ.
(C) In the setting of i.i.d. vector-valued data Y_1,...,Y_n with vector-valued parameter θ ∈ Θ ⊂ ℝ^k, suppose that there exists a consistent (in probability) estimator g_n(Y_1,...,Y_n) of θ.
Then show that θ is identifiable from the density family f(y,θ).
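The construction hinted at in (B) can be illustrated numerically. The sketch below is only an illustration under assumed choices that are not part of the assignment: f(x,θ) is taken to be the exponential density with mean θ and r(x) = I[x > 1], so that R(θ) = exp(-1/θ) is continuous and strictly increasing in θ, and the estimator inverts R at the sample average of r(X_j). The function names are hypothetical.

```python
import math
import random


def simulate_sample(theta, n, seed=0):
    """Draw n i.i.d. exponential observations with mean theta (assumed model)."""
    rng = random.Random(seed)
    # random.expovariate takes the rate, i.e. 1 / mean.
    return [rng.expovariate(1.0 / theta) for _ in range(n)]


def estimate_theta(sample, cutoff=1.0):
    """Invert R(theta) = P(X > cutoff) = exp(-cutoff / theta) at the sample average of r(X_j)."""
    r_bar = sum(1.0 for x in sample if x > cutoff) / len(sample)
    if not 0.0 < r_bar < 1.0:
        # R maps (0, infinity) onto (0, 1); outside that range no inverse exists.
        raise ValueError("sample average of r(X_j) is outside the range of R")
    return -cutoff / math.log(r_bar)


if __name__ == "__main__":
    true_theta = 2.0
    for n in (100, 1_000, 10_000, 100_000):
        estimate = estimate_theta(simulate_sample(true_theta, n, seed=n))
        print(n, round(estimate, 4))
```

By the Law of Large Numbers the average of r(X_j) converges in probability to R(θ), and continuity of the inverse of R carries that convergence over to the estimate, which is the pattern the printed values should display as n grows.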
All 7 problems are to be handed in (uploaded) Monday Sept. 12 in ELMS.
Read Chapter 1 Sections 1.2-1.3 of Bickel and Doksum and continue to review Appendix B.7.
In Bickel and Doksum, do problems # 1.2.2, 1.2.8, 1.2.12, 1.3.2, 1.3.3, 1.3.4(a) plus one additional problem:
(D) (a) Show that if a random K-vector v=(v_1,...,v_K) is Dirichlet(α) distributed, then v_1 ~ Beta(α_1, α_2+...+α_K).
Read Chapter 1 Sections 1.4, 1.5 and 1.6.1 of Bickel and Doksum.
In Bickel and Doksum, do problems # 1.4.4, 1.4.12, 1.4.24, 1.5.4, 1.5.5, 1.5.14, 1.5.16 (and in 1.5.16, prove minimality).
For #1.4.4, to say Z is of "no value" in predicting Y would mean that P(Y ≥ t | Z) is free of Z for all t, or equivalently that Y is independent of Z. To solve 1.4.4,
Read Chapter 1 Section 1.6 of Bickel and Doksum thoroughly. Also look at Sections 3.2-3.3 which will round out our coverage of decision theory before the in-class test on November 2.
In Bickel and Doksum, do the following problems from pp.87-95: # 1.6.2, 1.6.10, 1.6.17, 1.6.28, and 1.6.35. Then also do and hand in the following 3 problems:
(E) For a Poisson(λ) sample find the UMVUE (Uniformly Minimum Variance Unbiased Estimator) of e^{λ/2}.
(F) For a Poisson(λ) sample X_1, ..., X_n with prior π(λ) ~ Gamma(3,1) for the parameter λ, find the Bayes estimator of e^{λ/2} with respect to mean-squared error loss, and show that the mean-squared errors of both of the estimators found in (E) and (F) (in a frequentist sense, not using the prior) are of order 1/n and differ from each other by an amount of order 1/n^2.
(G) Suppose that the sample X_1, ..., X_n of nonnegative-integer observations have the probability mass function p(k,θ) = θ^k (1-θ) I[k ≥ 0] for unknown parameter θ > 0. Find the UMVUE's of 1/(1-θ) and of θ based on the data sample of size n.
Hint: finding an unbiased estimator of each of these functions of θ as a function of a single observation X_1 is a matter of identifying the coefficients of a power series in θ. Use the result of Bickel & Doksum problem 1.6.3 to do the conditional expectation calculation you need in this problem.
HW 5, due Friday 11/18/22 11:59pm (7 Problems)
Reading: Chapter 2 through Section 2.3, also Sections 2.4.2-2.4.3 and 3.4.2.
Do problems 2.2.11(b) (counts as 1/2 problem), 2.2.12, 2.2.21, 3.4.11 (counts as 1.5 problems), and 3.4.12, plus the following two extra problems:
(H) Let X_1, ..., X_n be an iid sample from N(μ,1) and ψ(μ) = μ^2. (a) Show that the minimum variance for any estimator of μ^2 from this sample, according to the Cramer-Rao inequality, is 4 μ^2/n. (b) Show that the UMVUE of μ^2 is X̄^2 - 1/n and that its variance is 4 μ^2/n + 2/n^2.
(I) Find by direct calculation the likelihood equation solved uniquely by the MLE of α based on a Gamma(α, 2) sample W_1,...,W_n, and also show by direct calculation that this is the same equation satisfied by the method of moments estimator of α. Why does this follow from Exponential-Family theory?
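The variance claims in (H) can be checked by simulation before attempting the proof. The sketch below is a numerical illustration, not a solution: it uses the fact that X̄ ~ N(μ, 1/n) exactly for an N(μ,1) sample, draws X̄ directly, and compares the simulated mean and variance of X̄^2 - 1/n with the two formulas stated in the problem. The function names and the values μ = 1, n = 10 are arbitrary choices made for the example.

```python
import random
import statistics


def umvue_draws(mu, n, reps, seed=0):
    """Simulate reps values of Xbar^2 - 1/n, drawing Xbar ~ N(mu, 1/n) directly."""
    rng = random.Random(seed)
    sd_xbar = (1.0 / n) ** 0.5
    return [rng.gauss(mu, sd_xbar) ** 2 - 1.0 / n for _ in range(reps)]


def cramer_rao_bound(mu, n):
    """Bound stated in (H)(a): (psi'(mu))^2 / (n * I(mu)) with psi'(mu) = 2*mu, I(mu) = 1."""
    return 4.0 * mu ** 2 / n


def stated_variance(mu, n):
    """Variance of the UMVUE as stated in (H)(b)."""
    return 4.0 * mu ** 2 / n + 2.0 / n ** 2


if __name__ == "__main__":
    mu, n = 1.0, 10
    draws = umvue_draws(mu, n, reps=200_000, seed=1)
    print("simulated mean     :", statistics.fmean(draws))
    print("target mu^2        :", mu ** 2)
    print("simulated variance :", statistics.variance(draws))
    print("stated variance    :", stated_variance(mu, n))
    print("Cramer-Rao bound   :", cramer_rao_bound(mu, n))
```

The gap 2/n^2 between the last two quantities shows that the information bound is not attained here, even though the estimator is the UMVUE.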
HW 6, due Monday 12/12/22 11:59pm (7.5 Problems)
Reading: Chapter 4 through Section 4.5.
In the Bickel & Doksum problems for Chapter 4, do 4.1.12 (counts as 1.5 problems), 4.2.2, 4.3.5, 4.3.7, 4.3.8, 4.3.10, 4.4.6.
HW7 is now cancelled. We will do these problems as part of STAT 701 in the Spring term.
Read Sections 4.5 and 4.9, and do Bickel and Doksum problems 4.4.14 and 4.5.3.
The most important of these errata are the ones on p.288, in Equation (6.2.7) and in line 14 (the last line of Thm. 6.2.25): what is important as a sufficient condition for completeness is that the set of "natural parameter" values (η_1,...,η_k) = (w_1(θ),w_2(θ),...,w_k(θ)) fills out an open set in R^k as θ runs through all of Θ.
Homework will generally not be accepted late, and must be handed in as an uploaded pdf on ELMS.
(If you scan your handwritten papers or generate them using some other word-processor like Word or
LaTeX, then convert them to pdf before uploading.)
The University of Maryland, College Park has a nationally recognized Code of Academic Integrity,
administered by the Student Honor Council. This Code sets standards for academic integrity at
Maryland for all undergraduate and graduate students. As a student you are responsible for
upholding these standards for this course. It is very important for you to be aware of the
consequences of cheating, fabrication, facilitation, and plagiarism. For more information on the
Code of Academic Integrity or the Student Honor Council, please visit http://www.shc.umd.edu.
To further exhibit your commitment to academic integrity, remember to sign the Honor Pledge on all
examinations and assignments:
"I pledge on my honor that I have not given or received any
unauthorized assistance on this examination (assignment)."
The guideline for the course on Homeworks is that you may get hints from each other or from me, but
that you must write up your solutions completely by yourself, without copying any parts of solutions
from each other.
Also: messages and updates (such as corrections to errors in stated homework problems or changes in due-dates)
will generally be posted here, on this web-page, and also through emails in the course-mail account.
For further information (updated throughout the term) on the timing of individual lectures and tests,
click here and see the Important Dates below. For auxiliary reading in several useful handouts that are
described and linked below, click here.
Lecture-Topic Handouts
sample problems for
the in-class final. Also see further sample Problems and Topics for the
Fall 2014 1st In-Class Test,
and sample Problems and Topics for the
Fall 2014 2nd In-Class Test.
For the in-class test in Fall 2022, here are some Topics and Sample Problems for In-Class Test.
Here are lists of Important Topics and Sample Problems for the Final Exam on Saturday Dec.17.
The paper uses many of the techniques we review in Chapters 1 and 3.
Statistical Computing Handouts
interpretation. First, under this heading, there are 4 pdf writeups on Random Number Generation,
simulation, and interpretation of simulation experiments:
(i) Pseudo-random number generation,
(ii) Transformation of Random Variables,
(iii) Statistical Simulation,
and (iv) if you want to read a little more on computational speedups in statistical
simulations, click here.
Topics (i) and (iv) were taken from my
web-pages for the course
STAT 705 on Statistical Computing in R.
Additional material on
statistical simulation for Bayesian MCMC is
discussed under heading (X) below.
within an EM estimation problem in random-intercept logistic regression. For additional pdf files of
"Mini-Course" Lectures, including computer-generated figures, see Lec.1 on Metropolis-Hastings Algorithm,
and Lec.2 on the Gibbs Sampler, with Figures that can be found in Mini-Course Figure Folders.
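For readers who have not yet opened the Metropolis-Hastings lecture, the following minimal sketch shows the basic random-walk version of the algorithm. It is not taken from the Mini-Course files: the standard normal target, the uniform proposal, the step size, and all function names are assumptions chosen only to keep the example short.

```python
import math
import random


def log_target(x):
    """Log density of the assumed target, N(0,1), up to an additive constant."""
    return -0.5 * x * x


def metropolis_hastings(log_density, n_steps, step=2.0, x0=0.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Uniform(-step, step) proposal."""
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step, step)
        log_p_new = log_density(proposal)
        # Symmetric proposal: accept with probability min(1, target ratio).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        chain.append(x)  # a rejected proposal repeats the current state
    return chain


if __name__ == "__main__":
    chain = metropolis_hastings(log_target, n_steps=60_000, seed=1)
    kept = chain[5_000:]  # discard an initial burn-in segment
    mean = sum(kept) / len(kept)
    var = sum((v - mean) ** 2 for v in kept) / (len(kept) - 1)
    print("chain mean:", mean, "chain variance:", var)
```

Only ratios of the target density enter the acceptance step, which is why the normalizing constant can be dropped; this is the feature that makes the method useful for Bayesian posteriors.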
Read Chapter 1, Sec. 1.1 and Appendices A.10-A.14 and B.7 of Bickel and Doksum.
In Bickel and Doksum, do problems # 1.1.1(d), 1.2.(b)-(c), 1.1.15, and B.7.10, along with 3 additional problems:
HW 2, due Tuesday September 27, 11:59pm (7 problems total)
(b) Suppose that in 100 multinomial trials with 3 outcome categories and unknown category probabilities (p_1, p_2, p_3) you observe respectively 37, 42, 21 outcomes in category 1, 2, 3. Assume that the prior density for the unknown (p_1, p_2) is proportional to p_1 * p_2, and find the prior and posterior probability that p_3 > 0.3.
Hint: the probabilities in (b) are cdf's for the Beta distribution, also called incomplete Beta integrals (which you must multiply by a Beta function value). You can get them either from Tables (not so easy to find these days) or by a one-line call to the Beta distribution function pbeta in R or a similarly named function in your favorite computing language (Matlab, Basic, Python, ...).
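If R is not at hand, a Beta cdf with positive integer parameters can be computed with nothing more than binomial coefficients, using the standard identity that the Beta(a, b) cdf at x equals P(Bin(a+b-1, x) ≥ a). The sketch below assumes integer a and b; the function name beta_cdf_int is hypothetical, and the parameter values in the usage lines are illustrative only, not the ones needed for problem (b).

```python
from math import comb


def beta_cdf_int(x, a, b):
    """P(B <= x) for B ~ Beta(a, b) with positive integer a and b.

    Uses the identity I_x(a, b) = P(Bin(a + b - 1, x) >= a).
    """
    if not (isinstance(a, int) and isinstance(b, int) and a >= 1 and b >= 1):
        raise ValueError("this sketch only handles positive integer parameters")
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = a + b - 1
    return sum(comb(n, j) * x ** j * (1.0 - x) ** (n - j) for j in range(a, n + 1))


if __name__ == "__main__":
    # Illustrative parameters only.
    print(beta_cdf_int(0.3, 2, 3))        # cdf at 0.3
    print(1.0 - beta_cdf_int(0.3, 2, 3))  # upper-tail probability P(B > 0.3)
```

For integer parameters this should agree with pbeta(x, a, b) in R; for non-integer parameters a library routine for the regularized incomplete Beta function is needed instead.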
HW 3, due Wednesday October 12, 11:59pm (7 problems total)
(a) Prove that sign(U_1), U_1^2 / (U_1^2 + U_2^2) and U_1^2 + U_2^2
are jointly independent random variables; and
(b) Show that the best predictor of Y = U_1 with respect to mean-square or absolute error loss is 0, but also find a loss function for which the best predictor of Y is a nontrivial function of U_1.
HW 4, due Saturday October 29, 11:59pm (8 problems total)
Important Dates
© Eric V Slud, Jan. 27, 2023.