Statistics 700 Mathematical Statistics I



Fall 2024 MWF 9-9:50am,    PHYS 2124

Instructor: Professor Eric Slud, Statistics Program, Math Dept., Rm 2314, x5-5469, slud@umd.edu

Office hours: M 10, F 11 (initially), or email me to make an appointment (can be on Zoom).
The TA/Grader for the course is Amandeep Chanda, who will hold regular office hours Mondays 2:30-4 for students to ask questions about lectures and HW problems.

Lecture Handouts Statistical Computing Handouts Homework Syllabus

Overview: This course introduces mathematical statistics at a theoretical graduate level, using tools of advanced calculus and basic analysis. The framework is to define families of probability models for observed-data structures and explain the sense in which functions of observed-data random variables can give a good idea of which of those probability models governed a particular dataset. The course objective is to treat diverse statistically interesting models for data in a conceptually unified way; to define mathematical properties that good procedures of statistical inference should have; and to prove that some common procedures have them. Aspects of the theoretical results are illustrated using demonstrations with statistical simulation. The fall term (Stat 700) will primarily emphasize definitions, concepts and finite-sample optimality properties of estimators and tests, while the spring term (Stat 701) emphasizes large-sample asymptotic theory and more modern topics.

Prerequisite: Stat 410 or equivalent. You should be comfortable (after review) with joint densities, (multivariate, Jacobian) changes of variable, moment generating functions, conditional expectation, the Central Limit Theorem and Law of Large Numbers, and mathematical analysis proofs at the level of Math 410-411. The first HW set, due 9/4/24, will have several problems at the level of the prerequisite review material.

Required Course Text:   P. Bickel & K. Doksum, Mathematical Statistics, vol.I, 2nd ed., Pearson Prentice Hall, 2007.

Recommended Texts:   (i)   George Casella and Roger Berger, Statistical Inference, 2nd ed., Duxbury, 2002.
(ii)   V. Rohatgi and A.K. Saleh, An Introduction to Probability and Statistics, 2nd ed., Wiley, 2001, for practice problems.
(iii)   Jun Shao, Mathematical Statistics, 2nd ed., Springer, 2003, for more theoretical material.
(iv)   P. Billingsley, Probability and Measure, 2nd (1986) or later edition, Wiley, for deeper probability background.

A sheet of errata in Casella & Berger compiled by Jeffrey Hart of Texas A&M Stat Dept can be found here.
The most important of these errata are the ones on p.288, in Equation (6.2.7) and in line 14 (the last line
of Thm. 6.2.25): what is important as a sufficient condition for completeness is that the set of "natural parameter"
values   (η1,...,ηk) = (w1(θ),w2(θ),...,wk(θ))   fills out an open set in   R^k   as   θ   runs through all of   Θ.

Course Coverage: STAT 700 and 701 divide roughly with definitions and properties for finite-sample statistics in the Fall (STAT 700), and large-sample limit theory in the Spring (STAT 701). The division is not quite complete, because STAT 700 motivates many topics (Point Estimation, Confidence Intervals, identifiability) in terms of the Law of Large Numbers. The coverage in the Bickel & Doksum book for the Fall is roughly Chapters 1-4 along with related reading in Casella & Berger for special topics.
We begin with an overview of statistical data structure, models and formal definition of statistics in Chapter 1 (Secs. 1.1.1-1.1.3). Succeeding lectures will review standard background material on probability and standard distributions (Appendix A, especially sections A.10-A.14) in order to set up later material on Exponential Families (Section 1.6). Brief review of basic statistical definitions will be done from the viewpoint of Decision Theory in Section 1.3. Introduction of the Bayesian viewpoint on statistical inference is naturally done in that context, and we will cover some of the Bayesian mechanics in Section 1.2. The other important material in Chapter 1 concerns the notion of "sufficient statistics" and "prediction" versus "estimation".
Chapter 2 covers the main estimation techniques: (generalized) method of moments, maximum likelihood, and Estimating Equations as a way to unify these two different-seeming methods in a general framework. Computational topics (algorithms, including numerical maximization and EM) for the solution of Maximum Likelihood and Estimating Equation problems are also covered in Chapter 2. Chapter 3 discusses notions of performance quality and optimality for statistical estimation procedures, while Chapter 4 introduces basic ideas and optimality principles related to hypothesis testing.
Readings in Casella and Berger will be occasional and topic-based. Some introductory Bayesian topics will be covered there, and basics on MCMC may also be discussed as part of Chapters 1 and 2 of Bickel and Doksum augmented by pdf handouts.

Grading: There will be graded homework sets roughly every 1.5 weeks (7 or 8 altogether); one in-class test, tentatively on Wed., Oct. 23; and an in-class Final Exam. The course grade will be based 45% on homeworks, 20% on the in-class test, and 35% on the Exam.
Homework will generally not be accepted late, and must be handed in as an uploaded pdf on ELMS. (If you scan your handwritten papers or generate them using some other word-processor like Word or LaTeX, then please convert them to pdf before uploading.) Throughout the term, partial problem set solutions will be posted to the ELMS course pages.

HONOR CODE

The University of Maryland, College Park has a nationally recognized Code of Academic Integrity, administered by the Student Honor Council. This Code sets standards for academic integrity at Maryland for all undergraduate and graduate students. As a student you are responsible for upholding these standards for this course. It is very important for you to be aware of the consequences of cheating, fabrication, facilitation, and plagiarism. For more information on the Code of Academic Integrity or the Student Honor Council, please visit http://www.shc.umd.edu.

To further exhibit your commitment to academic integrity, remember to sign the Honor Pledge on all examinations and assignments:
"I pledge on my honor that I have not given or received any unauthorized assistance on this examination (assignment)."

The guideline for the course on Homeworks is that you may get hints from each other or from me, but that you must write up your solutions completely by yourself, without copying any parts of solutions from each other.


The Fall 2024 Course Syllabus for Stat 700 is linked to this course web-page (and also posted on the ELMS course pages).
Also: messages and updates (such as corrections to errors in stated homework problems or changes in due-dates) will generally be posted here, on this web-page, and also through emails in the course-mail account.

For further information (updated throughout the term) see the Important Dates below.
For auxiliary reading in several useful handouts that are described and linked below, click here.


Lecture-Topic Handouts

(I)    You can see sample test problems for the 1st in-class test, along with the Fall 2009 In-Class Test and a set of
sample problems for the in-class final. Also see further sample Problems and Topics for the Fall 2014 1st In-Class Test.
For the in-class test in Fall 2024, here are some Topics and Sample Problems for In-Class Test.
Here are lists of Important Topics and Sample Problems for the Fall 2022 Final Exam.

(II)    Old homework problem assignments in Casella and Berger from Fall 2014 can be found here. You can also see most of the solutions to these problems.

(III)    I have a paper on the topic of distributions related to the normal that are or are not uniquely determined by their moments. (Hard-copy reprints available by request.) The paper uses many of the techniques we review in Chapters 1 and 3.

(IV)   The topic of mixture distributions and densities and their relation to hierarchical specification of a distributional model and to distribution functions of mixed type is elaborated in this handout. Here is an additional handout specifically on the identifiability of 2-component normal mixtures.

(V)   A paper for a talk I gave at the 2013 Federal Committee on Statistical Methodology includes the idea of following on those policies (in this case, for callback of nonrespondents in the American Community Survey) that are on the admissible frontier with respect to a specified set of loss functions. This idea is related to the discussion of admissibility in Section 1.3.3 of Bickel and Doksum.

(VI)   A handout on conjugate priors for a class of exponential family densities and probability mass functions.

(VII)   Handout on Cramer-Rao (Information) Inequality to supplement what we did in class and what is done in Bickel and Doksum's Section 3.4.2.

(VIII)   Handout on EM Algorithm from STAT 705.

(IX)   Lecture Notes from Stat 705 on Numerical Maximization of Likelihoods.
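
As a small illustration of numerical likelihood maximization (a sketch only, not taken from the Stat 705 notes), the following Python code applies Newton-Raphson to the location parameter μ of a logistic density, whose log-likelihood is concave in μ. The choice of model, the function names, the starting value (a sample median), and the simulated data are all assumptions made for this example.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def score(mu, xs):
    # Derivative of the logistic-location log-likelihood with respect to mu.
    return sum(2.0 * sigmoid(x - mu) - 1.0 for x in xs)

def newton_logistic_location(xs, tol=1e-10, max_iter=50):
    # Newton-Raphson iteration mu <- mu - score(mu) / hessian(mu).
    mu = sorted(xs)[len(xs) // 2]  # start at a sample median
    for _ in range(max_iter):
        s = [sigmoid(x - mu) for x in xs]
        g = sum(2.0 * si - 1.0 for si in s)          # score
        h = -2.0 * sum(si * (1.0 - si) for si in s)  # second derivative (< 0)
        step = g / h
        mu -= step
        if abs(step) < tol:
            break
    return mu

def draw_logistic(mu, rng):
    # Inverse-CDF transformation: mu + log(U / (1 - U)) for U ~ Uniform(0, 1).
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return mu + math.log(u / (1.0 - u))

rng = random.Random(0)
true_mu = 3.0
xs = [draw_logistic(true_mu, rng) for _ in range(500)]
mu_hat = newton_logistic_location(xs)
print("MLE of mu:", mu_hat, " score at MLE:", score(mu_hat, xs))
```

Because the second derivative is strictly negative, each Newton step is well defined here; for likelihoods that are not concave, a safeguarded step (for example step-halving) would normally be needed.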


Statistical Computing Handouts

(X) Topics on Statistical Simulation: There are two sorts of handouts on Simulation methods and
interpretation. First, under this heading, there are 4 pdf writeups on Random Number Generation,
simulation, and interpretation of simulation experiments: (i) Pseudo-random number generation,
(ii) Transformation of Random Variables,
(iii) Statistical Simulation,
and (iv), if you want to read a little more on computational speedups in statistical simulations, click here.
Topics (i) and (iv) were taken from my web-pages for the course STAT 705 on Statistical Computing in R.
Additional material on statistical simulation for Bayesian MCMC is discussed under heading (XI) below.

(XI) Background on Markov Chain Monte Carlo: First see Introduction and application of MCMC
within an EM estimation problem in random-intercept logistic regression. For additional pdf files of
"Mini-Course" Lectures, including computer-generated figures, see Lec.1 on Metropolis-Hastings Algorithm,
and Lec.2 on the Gibbs Sampler, with Figures that can be found in Mini-Course Figure Folders.
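
For readers who want to experiment before opening the Mini-Course lectures, here is a minimal random-walk Metropolis-Hastings sketch in Python. It is not taken from the handouts: the target (a standard normal, known only up to its normalizing constant), the proposal step size, the burn-in length, and all function names are assumptions chosen so that the output is easy to check.

```python
import math
import random

def log_target(x):
    # Unnormalized log density of N(0, 1); the constant cancels in the ratio.
    return -0.5 * x * x

def metropolis_hastings(log_target, x0, n_draws, step, rng):
    # Random-walk Metropolis-Hastings with a symmetric normal proposal.
    x = x0
    lp = log_target(x)
    draws = []
    accepted = 0
    for _ in range(n_draws):
        y = x + rng.gauss(0.0, step)  # propose a move
        lq = log_target(y)
        # Accept with probability min(1, target(y) / target(x)).
        # 1 - rng.random() lies in (0, 1], so the logarithm is finite.
        if math.log(1.0 - rng.random()) < lq - lp:
            x, lp = y, lq
            accepted += 1
        draws.append(x)  # a rejected proposal repeats the current state
    return draws, accepted / n_draws

rng = random.Random(1)
draws, acc_rate = metropolis_hastings(log_target, x0=0.0, n_draws=20000,
                                      step=2.0, rng=rng)
kept = draws[2000:]  # discard an (assumed) burn-in period
mean = sum(kept) / len(kept)
var = sum((d - mean) ** 2 for d in kept) / (len(kept) - 1)
print("acceptance rate:", acc_rate, " mean:", mean, " variance:", var)
```

The sample mean and variance of the retained draws should be close to 0 and 1. Successive draws are correlated, so the simulation error is larger than it would be for an independent sample of the same size.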



Homework: Assignments, including any changes and hints, will continually be posted here. The most current
form of the assignment will be posted also on ELMS. The HW problems assigned in Fall 2022 are listed here.

HW 1 due Wednesday Sept.4, 11:59pm (upload to ELMS) LaTeX'ed pdf file here.
Readings to do during this time: Chapter 1, Sec. 1.1 and Appendices A.10-A.14 and B.7 of Bickel and Doksum.

HW 2 due Wednesday Sept.18, 11:59pm (upload to ELMS)      Problems to solve include 1.1.3, 1.1.10, 1.5.5, 1.5.16
plus two more, given in LaTeX'ed pdf file here. Readings to do during this time: Chapter 1, Sec.1.1.3, 1.1.4, 1.5,
and Appendices A.14-A.15 of Bickel and Doksum, and Casella and Berger Sec.6.2.


Note added on 9/13: Problem 3 in HW2 has been modified. In the desired representation
h(E(g1(X1)), E(g2(X1)), ..., E(gk(X1))), it is important that h is continuous, and there is no harm in also requiring it
to be smooth (continuously differentiable), but the functions g1, g2, ..., gk are not required to be continuous.

HW 3 due Wednesday Oct.2, 11:59pm (upload to ELMS)      Problems to solve include 1.2.4 in Bickel and Doksum, plus five more, given in LaTeX'ed pdf file here. Readings to do during this time: Chapter 1 in Bickel-Doksum, Sections 1.2, 1.4, plus "Rao-Blackwell Theorem" and "Lehmann-Scheffé Theorem" and "Uniform Minimum Variance Unbiased Estimation", in any book, e.g. Casella and Berger.

HW 4 due Friday Oct.18, 11:59pm (upload to ELMS)      Problems to solve include 1.2.12, 1.2.14, 1.3.2, 1.3.6 (a),(b), and (e) in Bickel and Doksum, plus one more, given in LaTeX'ed pdf file here. Readings to do during this time: Chapter 1 in Bickel-Doksum, Sections 1.3, 1.4, plus sections on Decision Theory and on Sufficiency and Bayes Models in Section 1.5.

HW 5 due Friday Nov.8, 11:59pm (upload to ELMS)      Problems to solve include 1.6.5, 1.6.7, 1.6.11(e)-(f), 1.6.16, and 1.6.22 in Bickel and Doksum, plus one more with many parts [that count altogether as 2 problems], given in LaTeX'ed pdf file here. Readings to do during this time: Chapter 1 in Bickel-Doksum, Section 1.6, and review the Decision Theory material (including the finite-parameter and finite-action example) in Section 1.3.


HW 6 due Saturday Nov.23, 11:59pm (7 problems, upload to ELMS)      Readings to do during this time in Bickel & Doksum: Sec.1.6.5 on conjugate priors in natural exponential families, Chapter 2 through Sec.2.3 plus 2.4.3 on maximum likelihood, GMM and estimating equations, and Sec. 3.4.2 on Fisher Information and the "information inequality".

Problems to solve are: 1.6.32, 2.1.2(a)-(c), 2.2.13, 2.3.7 and 3.4.11 in Bickel and Doksum, plus the following two:

(6) Argue that all of the problem-parts of 2.2.10 except part (c), which you should skip, are canonical-exponential-family problems with full rank and that Theorems 2.3.1 and 2.3.2 apply to show that the MLEs in all parts exist and are unique.

(7) For a Poisson(λ) sample X1, ..., Xn find the MLE and the Bayes estimators (with respect to squared-error loss) of   λ   based (separately) on priors   π(λ) ~ Gamma(2,2)   and   π(λ) ~ Expon(0.1). Show that the biases of these three estimators are all of order 1/n (or smaller) and the mean-squared errors are of order 1/n and differ from each other by an amount of order 1/n^2.


HW 7 due Saturday Dec.7, 11:59pm (6 problems, upload to ELMS)      Readings to do during this time in Bickel & Doksum: Sec.4.1 through 4.3, plus 4.9.1-4.9.2 on Neyman-Pearson and Likelihood Ratio hypothesis testing.

Problems to solve are: 4.2.3, 4.2.4, 4.3.4, 4.3.6, 4.9.4 plus the following one:

(6) Suppose that n independent observations Xi, i=1,...,n, all follow the density f(x,θ) = θ e^(1 - θx) I[x ≥ 1/θ], for some unknown statistical parameter θ > 0.
(a) Is this a Monotone Likelihood Ratio family of densities?
(b) Find a most powerful hypothesis test of H0: θ=1 versus H1: θ=θ1, with significance level 0.05, where θ1 is any fixed number > 1. Is the test unique with the most-powerful property? Find the critical region explicitly in terms of the .95 quantile of a well-known distribution. Does the rejection region depend on θ1?



Important Dates


© Eric V Slud, Nov.25, 2024.