HW 18 Stat 705  Fall 2015

Assigned Wednesday 12/2/15 DUE Monday 12/14/15

Consider the dataset "Traffic" in the R package "MASS".
To access it, execute the R commands:

> library(MASS); Traffic=Traffic

Use help(Traffic) to learn some of the details of the experiment, 
which was conducted on 92 matched day-pairs in 1961 and 1962 on a 
motorway in Sweden. The table below cross-classifies the 92 day-pairs 
according to whether traffic speed limits were "in effect and 
enforced" on the 1961 day and on the 1962 day.

> table(Yr1961=Traffic$limit[1:92], Yr1962=Traffic$limit[93:184])
      Yr1962
Yr1961 no yes
   no  28  43
   yes 16   5

Find the best 90% two-sided confidence interval you 
can for the expected number of traffic accidents on 
a day without a speed limit minus the expected number 
of traffic accidents on a day with a speed limit.

Assume the qualitative "model" that the days on 
which the experiment was conducted in 1961 are in 
every way comparable to the same-numbered days on 
which the experiment was conducted in 1962. Also 
assume that, of the 92 day-pairs on which the experiment 
was conducted in 1961 and 1962, 28, 16, 43, and 5 were 
drawn randomly (equiprobably) from 1:92 to fill, 
respectively, the four categories:
     No limit in 1961 and No limit in 1962
     Limit in 1961 and No limit in 1962
     No limit in 1961 and Limit in 1962
     Limit in 1961 and Limit in 1962.
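
As a starting point, the day-pairs and the four categories above can 
be laid out as in the sketch below. It assumes that sorting Traffic 
by year and then by day lines up day j of 1961 with day j of 1962; 
the names dp, lim61, acc61, lim62, acc62 and category are 
illustrative only and are not part of the dataset.

```r
library(MASS)

# Assumption: sorting by year, then day, matches day j of 1961 with day j of 1962.
tr  <- Traffic[order(Traffic$year, Traffic$day), ]
a61 <- tr[tr$year == 1961, ]
a62 <- tr[tr$year == 1962, ]

# One row per matched day-pair (column names are illustrative).
dp <- data.frame(day   = a61$day,
                 lim61 = as.character(a61$limit), acc61 = a61$y,
                 lim62 = as.character(a62$limit), acc62 = a62$y)

# Label the four categories in the order in which they are listed above.
dp$category <- factor(paste(dp$lim61, dp$lim62),
                      levels = c("no no", "yes no", "no yes", "yes yes"),
                      labels = c("No limit either year", "Limit 1961 only",
                                 "Limit 1962 only", "Limit both years"))

counts <- table(dp$category)
print(counts)
```

The category counts should agree with the cross-classified table 
shown earlier.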

You may look at the data on numbers of accidents to 
try to fit a parametric distribution model to the numbers 
of accidents on a day with no speed limit and the numbers 
of accidents on a day with a speed limit. (Assume 
that the numbers of accidents are otherwise independent 
for the different days and different years, and that 
the distribution of accidents does not depend on the year, 
only on whether there is a speed limit.)
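
One possible version of the parametric route is sketched below. The 
Poisson family is only a candidate chosen here for illustration (the 
assignment does not name a family), so the sketch first compares the 
sample variance with the sample mean in each group and then runs a 
parametric bootstrap under that assumed model. The seed, the number 
of replicates B, and the choice of a "basic" (reflected) interval are 
likewise illustrative assumptions.

```r
library(MASS)

y_no  <- Traffic$y[Traffic$limit == "no"]
y_yes <- Traffic$y[Traffic$limit == "yes"]
n_no  <- length(y_no)
n_yes <- length(y_yes)

# Under a Poisson model the MLE of each mean is the sample mean, and
# the variance/mean ratio should be near 1; a ratio well above 1
# suggests overdispersion and a different family.
summ <- rbind(no  = c(n = n_no,  mean = mean(y_no),  var = var(y_no)),
              yes = c(n = n_yes, mean = mean(y_yes), var = var(y_yes)))
summ <- cbind(summ, ratio = summ[, "var"] / summ[, "mean"])
print(summ)

# Parametric bootstrap of the difference in means under the ASSUMED
# Poisson model with the fitted means.
set.seed(705)
B <- 2000
d_hat  <- mean(y_no) - mean(y_yes)
d_star <- replicate(B, mean(rpois(n_no,  mean(y_no))) -
                       mean(rpois(n_yes, mean(y_yes))))

# Basic 90% interval: reflect the bootstrap quantiles about d_hat.
q  <- unname(quantile(d_star, c(0.05, 0.95)))
ci <- c(lower = 2 * d_hat - q[2], upper = 2 * d_hat - q[1])
print(ci)
```

If the variance/mean ratios are far from 1, the interval above is 
not trustworthy as it stands, and that should be addressed in the 
write-up.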

Or else, you may treat the problem without further 
distributional assumptions. Legitimate tools include 
parametric and nonparametric bootstrap and "permutational"
distributions for the statistic 
(Avg # accidents on days with no limit) - (Avg # on days with limit)

or for the statistic
(# accidents on no-limit day minus # accidents on limit day, summed 
over pairs in 1:92 with one of each)/(# of such pairs).
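
The two statistics above can be computed and resampled roughly as 
follows. This is a sketch only: it assumes that sorting Traffic by 
year and then day aligns the matched days, it uses simple percentile 
intervals, and the seed and the number of replicates B are arbitrary. 
The sign-flip step gives a "permutational" null distribution for the 
paired statistic under the hypothesis of no speed-limit effect; 
turning it into a confidence interval would require inverting the 
test over shifted values of the difference.

```r
library(MASS)

# Assumption: sorting by year, then day, matches day j of 1961 with day j of 1962.
tr  <- Traffic[order(Traffic$year, Traffic$day), ]
a61 <- tr[tr$year == 1961, ]
a62 <- tr[tr$year == 1962, ]

# Statistic 1: unpaired difference of averages over all 184 days.
y_no  <- tr$y[tr$limit == "no"]
y_yes <- tr$y[tr$limit == "yes"]
stat1 <- mean(y_no) - mean(y_yes)

# Statistic 2: average (no-limit minus limit) difference over the
# pairs that contain one day of each kind.
l61  <- as.character(a61$limit)
l62  <- as.character(a62$limit)
disc <- l61 != l62
d    <- ifelse(l61[disc] == "no",
               a61$y[disc] - a62$y[disc],
               a62$y[disc] - a61$y[disc])
stat2 <- mean(d)

set.seed(705)
B <- 2000

# Nonparametric bootstrap: resample days within each limit group
# (statistic 1) and resample whole discordant pairs (statistic 2).
s1_star <- replicate(B, mean(sample(y_no,  replace = TRUE)) -
                        mean(sample(y_yes, replace = TRUE)))
s2_star <- replicate(B, mean(sample(d, replace = TRUE)))
ci1 <- unname(quantile(s1_star, c(0.05, 0.95)))
ci2 <- unname(quantile(s2_star, c(0.05, 0.95)))

# Sign-flip permutation distribution of statistic 2 under no effect.
perm   <- replicate(B, mean(d * sample(c(-1, 1), length(d), replace = TRUE)))
p_perm <- mean(abs(perm) >= abs(stat2))

print(rbind(unpaired = c(est = stat1, lo = ci1[1], hi = ci1[2]),
            paired   = c(est = stat2, lo = ci2[1], hi = ci2[2])))
print(p_perm)
```

Statistic 1 uses all 184 days but ignores the matching, while 
statistic 2 uses only the discordant pairs and relies on the matching 
to remove day-to-day variation; which is preferable depends on the 
assumptions you are willing to defend.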

GIVE YOUR REASONING FOR WHATEVER ANALYSIS YOU DO, AND EXPLAIN WHY 
AND UNDER WHAT ASSUMPTIONS (BEYOND THE ONES I IMPOSED ABOVE, IF 
YOU NEED ANY) IT SHOULD BE VALID.