Statistics 400 Applied Probability & Statistics I 
MWF 10, Rm B0421                                                            Fall 2003

This course introduces the basic mathematical ideas using single-variable calculus
necessary to perform simple probability calculations and approximations and to
formulate and address statistical problems. Wherever possible, the ideas are made
concrete as predictions about random data simulated on the computer.

Instructor:    Eric Slud, Math. Dept. Rm. 2314,  X 5-5469,  evs@math.umd.edu

Office hours: Monday 12-2 (initially) or by appointment.

Prerequisite:  Math 140-141.  Genuine proficiency in the very basic operations of
single-variable calculus is needed, i.e., summation of simple series, integration (by
substitution and by parts), and differentiation including the chain rule. The course is
also geared to the solution of word problems.

Text: Probability & Statistics for Engineering and the Sciences, 6th ed. (2004),
by J. L. Devore, Duxbury Press.

Coverage: In the Devore text:  Chapter 2; Chapter 3 (omitting the negative binomial
distribution in section 3.5 and the Poisson process in section 3.6); Chapter 4 (omitting
Sections 4.5, 4.6); Extra topic based on handout(s): Distributions of Functions of
Random Variables, applied to Computer Simulation of Random Data; Section 5.1
(subsections on: discrete joint probability mass functions, independent random
variables); Section 5.2 (initial subsection only); Sections 5.3-5.5; Chapter 1,
especially section 1.2; Chapters 6, 7, 8; Chapter 9, sections 9.1-9.3 as time
permits.

Grading: The grade in the course will be based 35% on homework (about 9 graded
assignments), 40% on 3 in-class tests (approximately equally spaced across the term,
the first on October 3), and 25% on a comprehensive final (Saturday, Dec. 20, 8-10 a.m.).

Note on using the text. With this text, you should not need another text for backup,
but it would still be a good idea to use Schaum's Outline in Probability & Statistics
or some other introductory book as a source of auxiliary (worked) problems.
      The exam and at least some of the tests will be open-book, so throughout the
course you should maintain your own list of the most important definitions and formulas
for easy reference.

Note on HW: Homework assignments are to be handed in on time and will be graded.
Initially, about half of the assigned problems (chosen at random by me, after you hand
your homework papers in) will be graded. To get full credit, show complete reasoning
on the problems you hand in, since solutions to at least the odd-numbered problems
can be found at the back of the book.

First Assignment: read Ch. 2, sections 2.1, 2.2 and 2.3 by next Monday (9/8), and
the rest of Chapter 2 by the following Friday (9/12).

Problem Set 1 due Wednesday, 9/10: Sec 2.1 (p. 57) # 4, 6; Sec. 2.2 (pp. 65-66)
# 18, 20, 26; Sec. 2.3 (pp. 74-75), # 34, 40, 41; Sec. 2.4, # 48.

Note: I will grade only problems which are in sections we have already covered
by the due-date of the assignment (and for now, not all of those either). This means
that the last problem will not be graded this time around.

Problem Set 2 due Friday, 9/19:   Sec. 2.3 # 39, 42. Sec. 2.4 # 50, 51, 60.  Sec. 2.5 # 72, 78.
Sec. 3.1 #4 (Also give the probabilities of your selected outcomes for X,  if all zip codes
00000 to 99999 were equiprobable); Sec. 3.2 # 18, 20.

Solutions to past (selected) HW problems [the trickiest ones] are
available at this link.

Problem Set 3 due Wednesday, 10/1:   Sec. 2.4 #62; Sec 2.5 #82, 83;  Sec. 3.2 #14, 22;
Sec. 3.3 #35, 38. Sec. 3.4 #46, 62; Sec. 3.5 #64, 66; Sec. 3.6 #76, 84.  (6 will be graded)

A sample test for in-class test 1 can be found here.

Problem Set 4 due Friday, 10/17:   Sec. 3.5 #65, 71; Sec. 3.6 #79; Sec. 3.3 #31,
Sec. 4.1 #2, 3, 6; Sec. 4.2 # 11, 20; Sec. 4.3 # 32, 50; Sec. 4.4 #59. (6 will be graded)
NOTE change of due-date for HW4 !!

Problem Set 5 due Wednesday, 10/29:   Sec. 4.2 #24; Sec. 4.3 # 35, 42, 51;
Sec. 4.4 # 64; Sec. 4.5 # 66, 72; Sec. 5.1 #3, 7; Sec. 5.2 # 26, and for this problem also
find the correlation of X and Y;  Sec. 5.3 #38, 42.
In addition, read the Handouts  Transformation of Random Variables  and
Random Number Generation and Simulation below. We will cover this material in
the next couple of classes. Also do Problems TRAN.1, TRAN.2, and Sim.1 from
these handouts, to be handed in with HW5.

Problem Set 6 due Wednesday, 11/5:   Sec. 5.2 #22, and for this problem also
find the correlation of X and Y;  Sec. 5.3 #37; Sec. 5.4 #46, 48, 56; Sec. 5.5 #58, 60;
plus  Sim.3 from the  Random Number Generation and  Simulation handout below.

A sample test for in-class test 2 can be found here.

Problem Set 7 due Monday, 11/24:   Sec. 5.5 #58, 60; Sec. 6.1 #3, 6, 9;
Sec. 1.3 #34(a,b), 39. Sec. 7.2 #14, 20.
Also hand in:
(1) Using the data of #20 in Section 1.2, (a) graph the empirical distribution function,
     (b) use (a) to find the sample median and 0.75 quantile, and (c) construct and
     graph a scaled relative frequency histogram using the interval boundaries 0, 1000,
     2000, 3000, 4000, 5000, and 6000. Note that for it to be scaled, the vertical units
     must be chosen so that the total area in the histogram bars is 1.
(2)  Suppose that random variables X_1, X_2, ...., X_60  are independent and Expon(2)
     distributed. Find approximately the numerical probability that X_1 + ... + X_60  falls
     between  26 and 36.
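As a numerical check on the technique problem (2) calls for (not a substitute for showing your reasoning), the CLT approximation can be computed directly. The sketch below assumes Expon(2) means rate lambda = 2, as in Devore's parameterization, so each X_i has mean 1/2 and variance 1/4, and the sum has mean 30 and variance 15.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumption: Expon(2) means rate lambda = 2 (Devore's convention),
# so E[X_i] = 1/2 and Var(X_i) = 1/4.
lam, n = 2.0, 60
mean_sum = n / lam                # E[X_1 + ... + X_60] = 30
sd_sum = math.sqrt(n / lam**2)    # SD of the sum = sqrt(15)

# CLT: P(26 < S < 36) is approximately Phi((36-30)/sqrt(15)) - Phi((26-30)/sqrt(15))
approx = normal_cdf((36 - mean_sum) / sd_sum) - normal_cdf((26 - mean_sum) / sd_sum)
```

Under the rate-2 convention the approximation comes out a bit below 0.8; if Expon(2) were instead read as mean 2, the mean of the sum would be 120 and the answer would change completely, so settle the parameterization first.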

Problem Set 8 due Wednesday, 12/3:   Sec. 7.2 #21, 25.  Sec. 7.3 # 30, 33, 34, 38.
     Supplementary Exercises Ch.7: #48, 52. Ch. 8 Sec. 1: # 9, 10, Sec 8.2: #16, 22.
 

A sample test for in-class test 3 can be found here.
 

NOTE: Although I did not explicitly list the following types of problems in my
  summary of questions which MAY appear on the in-class exam (Test 3)
  on Friday, December 5, they ARE fair game:
     sample quantiles and SCALED relative frequency histograms;
     sample-size calculations (to achieve specified precision);
     and prediction intervals.
  You should review these topics and prepare your cheat-sheets accordingly.
 

A Review Session for the Final Exam will be held on Wednesday,
December 17, 2003, from 3 to 5pm, in Room Mth 0103 .

A sample of Final Exam problems can be found here.

See also a quick summary and outline of topics for you to study
in reviewing for the exam.


To see the Tests given earlier this term, along with the Solutions
handed out for Tests 2 and 3, click  here .
 


Handouts:

(1) 9/29/03 This handout concerns numerical calculations for the Binomial approximation
    to Hypergeometric random variables, and the Poisson approximation to the Binomial.
    In addition, some simulated-data results are given to show that the expectations and
    probability mass functions behave as they should according to the relative-frequency
    interpretation of probabilities.

(2) 10/20/03 There are two handouts here, respectively on  Transformation of Random
Variables  and on Random Number Generation and Simulation. These topics are very
important for the rest of the course, as they allow us to generate and interpret `artificial data'
to illustrate the meaning of our Probability Limit Theorems (Law of Large Numbers, Central
Limit Theorem) and later statistical results (consistent statistical estimators, confidence
intervals). In addition, simulation gives us an `experimental' avenue to calculate, via artificial
data, probabilities which may be too difficult to compute analytically.
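The simulation method from this handout (treated in class as simulating from density F' by computing F^{-1}(U) with U ~ Unif[0,1]) can be sketched for the exponential case, where the inverse distribution function has the closed form F^{-1}(u) = -ln(1-u)/lambda. The function names below are illustrative, not taken from the handout.

```python
import math
import random

def expon_inverse_cdf(u, lam):
    """Inverse of the Expon(lam) distribution function: F^{-1}(u) = -ln(1-u)/lam."""
    return -math.log(1.0 - u) / lam

def simulate_expon(n, lam, seed=0):
    """Generate n pseudo-random Expon(lam) variates by applying F^{-1}
    to Uniform[0,1] draws."""
    rng = random.Random(seed)
    return [expon_inverse_cdf(rng.random(), lam) for _ in range(n)]

# Relative-frequency check: the sample mean should be close to E[X] = 1/lam.
sample = simulate_expon(10000, lam=2.0)
sample_mean = sum(sample) / len(sample)
```

The same recipe works for any distribution whose d.f. can be inverted, which is exactly why the handout pairs transformation of random variables with random number generation.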

(3) 10/22/03 The handout on Normal Approximation to Binomial Distribution contains a
word-problem worked example, as well as some numerical examples of the quality of the
normal approximation to the Binomial.  This example is continued below, in a statistical
setting (confidence interval for estimate of a population proportion in a political opinion poll)
in handout (7) below, dated 11/19/03.
     A graph comparing the distribution function values of Binom(100,.3) with its
approximating normal distribution N(30,21) can be found here.
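The comparison in that graph can be reproduced numerically. The sketch below tabulates the Binom(100, .3) distribution function exactly (by direct summation) against its N(30, 21) approximation, using the continuity correction of evaluating the normal d.f. at k + 0.5.

```python
import math

def binom_cdf(k, n, p):
    """Binomial(n, p) distribution function P(X <= k), by direct summation."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_cdf(x, mu, var):
    """Normal(mu, var) distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * var)))

n, p = 100, 0.3
mu, var = n * p, n * p * (1 - p)   # the approximating N(30, 21)

# Continuity correction: approximate P(X <= k) by the normal cdf at k + 0.5.
errors = [abs(binom_cdf(k, n, p) - normal_cdf(k + 0.5, mu, var))
          for k in (20, 25, 30, 35, 40)]
```

The discrepancies stay under about 0.01 across this range, which is the point the graph makes visually.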

(4) 10/27/03 A preliminary version of this handout was distributed and discussed in class:
this is our first  Example of Simulation for Calculating Probability and Expectation.

(5) 11/3/03 This picture handed out in class shows the behavior of sample averages  Sn/n
as a function of  n  from  1,...,2000  on each of four sets of simulated data, from different types
of random variables. Within each picture, the sample averages Sn/n are based on progres-
sively larger segments of the same 2000 data-values, and the point is to see that these
averages settle down to the place where the Law of Large Numbers guarantees they should
for large enough  n,  namely the theoretical expectation of the individual r.v.'s.
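A minimal version of the simulation behind this picture, with Uniform[0,1] variables standing in for one of the four data types, might look like the following; by the Law of Large Numbers the running averages S_n/n should settle near E[U] = 1/2.

```python
import random

def running_averages(n, seed=1):
    """Simulate n Uniform[0,1] values and return the running averages
    S_k/k for k = 1, ..., n."""
    rng = random.Random(seed)
    total, averages = 0.0, []
    for k in range(1, n + 1):
        total += rng.random()
        averages.append(total / k)
    return averages

avgs = running_averages(2000)
# Early averages fluctuate; avgs[-1] (n = 2000) should be close to 0.5.
```

Plotting avgs against k = 1,...,2000 reproduces one panel of the handout; repeating with exponential or binomial draws in place of rng.random() gives the other panels.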

(6) 11/12/03  Pictures handed out in class to show the behavior of scaled relative frequency
histograms by comparison with densities. Pictures in this document show the overlaid plots of
histograms in large simulated samples with the theoretical densities they are supposed to
represent. Pictures overlaying empirical distribution functions with the theoretical cdf's that
the data in large simulated datasets are supposed to represent are also available, in two settings:
     (i) The overlaid empirical and theoretical cdf's for 1000 simulated values of Z1+Z2 (sum of
two independent standard normal deviates) can be found here.
    (ii) The overlaid empirical and theoretical cdf's for 1000 simulated values of U_1+...+U_100
(sum of 100 independent Uniform[0,1] deviates) can be found here.
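The empirical-vs-theoretical comparison in these plots can be sketched in a few lines. Plain Uniform[0,1] data (theoretical cdf F(x) = x) stands in here for the simulated sums, since that keeps the theoretical cdf in closed form.

```python
import random

def empirical_cdf(data, x):
    """Empirical distribution function: fraction of the data values <= x."""
    return sum(1 for v in data if v <= x) / len(data)

rng = random.Random(0)
sample = [rng.random() for _ in range(1000)]   # 1000 simulated Uniform[0,1] values

# Largest gap between empirical and theoretical cdf's over a grid on [0,1];
# for n = 1000 this should be small, roughly on the order of 1/sqrt(1000).
max_gap = max(abs(empirical_cdf(sample, j / 100) - j / 100) for j in range(101))
```

Overlaying the two functions, as the handout pictures do, makes the same point graphically: the empirical cdf hugs the theoretical one ever more tightly as the sample grows.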

(7) 11/19/03  The word-problem on political opinion polling begun in handout (3) above,
dated 10/22/03, is continued here from the vantage point of statistics, particularly
confidence intervals for estimates of a population proportion in a political opinion poll.



15-week Course coverage based on Devore Book and 38 classes

Class #      Sections         Topics
 & Date

(1)    9/3         1.1, 2.1         Overview & Sample Spaces: Probability as limiting relative
                                                    frequency under replicated experiments.
(2)    9/5           2.2               Events, Prob. axioms, Equiprobable outcomes.
(3)   9/8,9/10   2.3               Counting techniques, combinatorial examples.
(4)    9/12         2.4               Conditional prob. via urn problems, Bayes rule.
(5)    9/15         2.5               Independence & dependence from sampling with &
                                                    without replacement. Examples like craps (prob that A
                                                    occurs before B in repeated trials) or other combinatorics.
(6)    9/17       3.1-3.2         Definition of random variable, prob. mass fcn.,
                                                    combinatorial examples of calculation.
(7)    9/22         3.2                Distribution function, general & combinatorial examples.
(8)    9/24         3.4                Binomial distribution, tables, word-problems.
(9)    9/26         3.5                Hypergeometric distribution: def'n through finite sampling
                                                    (or through conditional dist of binomial X given X+Y).
                                                    Binomial as large-population limit of hypergeom.
(10)   9/29        3.6                Poisson as limit of binomial. Examples,  overview of discrete
                                                    distributions (uniform, binom, hypergeom, geometric, Poisson)
                                                    determined by qualitative properties ("word problem types").
(11)   10/1        3.3                 Expectation, def'n as large-sample average via relative freq's.
                                                    Calculation for binomial and made-up gambling examples.

(12)   10/8-10   3.3               Expectation of function of r.v., with  variance (via formula
                                                     EX^2-(EX)^2) as illustration. EITHER additional moment-
                                                     calculations OR word-problem examples. In addition, mean
                                                     and variance for binomial, Poisson, Hypergeometric, geometric.
(13)   10/13   4.1-4.2          Definition of continuous r.v.'s, density and distribution fcns.
(14)   10/15   4.3-4.4          Uniform, exponential & normal densities: probability calculations.
(15)   10/17   4.2                   Calculations of expectations, variances: interpretations of param-
                                                     eters in Uniform, exponential, normal. [Extra topic omitted:
                                                     gamma integrals for exponential moments & gamma density.]
                                                 Quantiles of continuous random variables.
(16)    10/20 [Extra topic]  Change of variable: distribution and density of function of r.v.
(18)-(19)  [Extra topic]    Application of univariate change of variable either to develop other
           10/22                               distributional examples (Weibull, lognormal) as in Section 4.5, OR
                                                     to simulate from density F' as F^{-1}(U), where  U ~ Unif[0,1].
(20)   10/24   5.1                   Only subsections on discrete joint prob. mass functions, indep. rv's.
(21)   10/27   5.2                    Initial subsection only. [Optional extra topic: covariance and
                                                     correlation as measure of dependence.]
(22)-(23)  11/3    5.3-5.4   The Central Limit Theorem, Law of Large Numbers.
(24) -(25)  11/5 5.4-5.5  Applications of CLT. Normal approximation to binomial dist'n.
                                                  Mean & variance formulas for weighted sums of iid variables.

REVIEW   11/5  FOR IN-CLASS TEST 2, to be held  11/7/03

(26)   11/10                             More on CLT, exactness of CLT for normal r.v.'s.
(27)   11/12     1.2                   Scaled relative frequency histograms & empirical distribution
                                                     functions. Computer-simulated examples. Connection of sample
                                                     quantiles (ie, quantiles in empirical distribution) with true
                                                     population quantiles.
(28)   11/14    6.1                   Notion of a parameter (e.g. unknown mean or variance) of a
                                                     distribution.  (More discussion of empirical df's & histograms.)
(29)   11/17   6.1, 6.2            Statistic minus parameter as r.v. Verification that sample
                                                     variance is unbiased and consistent for true variance.
                                                     Method of moments estimators (skip MLE).
(30)  11/19   7.1, 7.2             Confidence intervals, def'ns and terminology.   Large-sample
(31)  11/21       "                         intervals for mean or proportion (variance known or estimated).
(32)   11/24   7.3                    Confidence interval,  (finite-sample, unknown-variance,
                                                       normal data).  t distribution and tables.
(33)   11/26 7.2-7.4              One-sided intervals, additional word-problem examples.
(34)   12/1      8.1                    Prediction intervals.  Beginning hypothesis test terminology.

REVIEW   12/3  FOR IN-CLASS TEST 3, to be held  12/5/03

(35)   12/8    8.1-8.2            Hypothesis test terminology. Normal-data example, known variance.
                                                      Hypothesis test about population mean, more generally.
(36)   12/10   8.3-8.4            Examples  of calculating power. Sample size formulas.
                                                      Hypothesis test for population proportion.
(37)   12/12   8.5                     P-values. Duality between hypothesis tests & confidence intervals.

REVIEW SESSION FOR FINAL EXAM will be scheduled 12/15 or 12/16


COURSE OUTLINE

I. Descriptive Statistics and Data Presentation
     Sample space, events as subsets
     (Scaled) relative-frequency histograms, sample quantiles and moments.
II. Probability Fundamentals
     Probabilities as limiting relative frequencies. Probability axioms.
     Counting techniques, equally likely outcomes.
     Conditional probability, (mutually) independent events.
     Bayes' rule.
     *Subjective probabilities as betting-odds.
III. Discrete Random Variables
     Probability mass function, distribution function, expected values, moments.
     Binomial, hypergeometric, Poisson distributions.
     Binomial as limit of hypergeometric, and Poisson  as limit of binomial.
IV. Continuous Random Variables
     Probability density function, distribution function, expected values, moments.
     *Theoretical quantiles for continuous random variables.
     Uniform,  exponential, Normal distributions.
     *Gamma function and gamma distribution.
     *Transformation of random variables (by smoothly invertible functions):
            distribution function and density.
     *Simulation of pseudo-random variates of specified distribution (as inverse
            d.f. of Uniform).
V. Joint distributions, Random Sampling.
      Bivariate random variables, joint (discrete) probability mass functions.
      *Expectation of function of jointly distributed random variables.
      *Covariance and correlation.
      Mutually independent random variables. Sums of independent random variables,
            and their means and variances.
      Law of Large Numbers, Central Limit Theorem.
      *Connection between scaled histograms of random samples and probability density.
VI. Point Estimation
      Populations, statistics, parameters, and sampling.
      Properties of estimators: consistency  (*and unbiasedness)
      Estimation of mean, variance, proportion.
      Method of moments estimation.
      *Estimators as population characteristics of  the Empirical Distribution.
VII. Confidence Intervals
      Large sample confidence intervals for means and proportions using Central Limit Thm.
      Small sample methods for normal populations, Student t distribution.
      *Small sample confidence interval for variance in normal population, chi-square distribution.
      Hypothesis testing about means using Confidence Interval.
      Hypothesis testing definitions (type I and II errors, significance level, power), examples
            using binomial and normal data.
 



© Eric V Slud, December 10, 2003.