**Meeting 2-3:15pm, Thursdays (regular room MTH1313)
Fall 2010**

**Eric Slud, Statistics Program, Math Department**

Interested participants should email:
**evs@math.umd.edu**

**RIT Focus:** **Biased Sampling** generally refers to
the statistical analysis of data such that the population on which we
see data differs (in ways which we either know or model) from the
target population. This topic is closely related to the unequal
probability sampling strategies in Sample Surveys, and to the still
more unequal probabilities with which selected units in the population
respond (i.e., provide data). This kind of differentially missing data
is in turn closely related to notions of `censoring' in biostatistical
studies. Unequal probabilities of sampling in
biostatistical contexts arise in connection with `prevalent
cohort' and other epidemiologic cross-sectional sampling
strategies. When biostatistical studies have
entry criteria related to the previous occurrence of some symptoms or
other biological condition (such as being `infected' or having a
disease advanced to a specified stage), we have biased sampling.
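
As a small illustration of how known unequal selection probabilities can be corrected for, here is a minimal Python sketch. It is not taken from the RIT readings, and the function names `horvitz_thompson_total` and `size_biased_mean` are hypothetical. It shows inverse-probability weighting in two simple settings: the Horvitz-Thompson estimator of a population total, and the harmonic-mean correction for a sample drawn with probability proportional to the measured (positive) size. Both assume the selection mechanism is known exactly, which is rarely true in practice.

```python
from typing import Sequence


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Estimate a population total from sampled values y, where pi[i] is the
    (assumed known) probability that unit i was included in the sample."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not (0.0 < p <= 1.0) for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each sampled unit 'represents' 1/pi[i] units of the target population.
    return sum(yi / p for yi, p in zip(y, pi))


def size_biased_mean(x: Sequence[float]) -> float:
    """Estimate the target-population mean from a sample in which each unit
    was selected with probability proportional to its own positive value x.

    Under that assumption the sampled density is x*f(x)/mu, so the expected
    value of 1/X in the sample is 1/mu, and the harmonic mean of the sample
    estimates mu."""
    if len(x) == 0:
        raise ValueError("sample must be non-empty")
    if any(xi <= 0 for xi in x):
        raise ValueError("size-biased sampling requires positive values")
    return len(x) / sum(1.0 / xi for xi in x)


if __name__ == "__main__":
    # Two sampled units with inclusion probabilities 1/2 and 1/4.
    print(horvitz_thompson_total([10.0, 20.0], [0.5, 0.25]))
    # The harmonic mean is pulled below the naive sample mean of 7/3.
    print(size_biased_mean([1.0, 2.0, 4.0]))
```

The naive sample mean in the second case overstates the target mean, because larger units are over-represented; the weights 1/x undo that over-representation.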

We will read papers and
background texts concerning sampling designs with unequal
mechanisms of selection, unequal probabilities of response,
and parametric and nonparametric identifiability and analysis of
data. The statistical machinery will involve some discussion of
Estimating Equations, semiparametric statistics, and some
historical discussion of the attempts that have been made to connect
survey data to the Likelihood concept.

**Prerequisites:** Participants should have had a course
in Mathematical Statistics (at the level of Stat 700-701 or higher)
and some introduction to survey or biostatistical (survival) data.

**Topics by Keyword:**

to Horvitz-Thompson survey estimator

**Reading List**
(Still under construction)

**Books**

Fitzmaurice, G., Davidian, M., Verbeke, G. and
Molenberghs, G. eds. (2008) **Longitudinal Data Analysis**,
Handbooks of Modern Statistical Methods, Chapman & Hall/CRC.

Korn, E. and Graubard, B. (1999) **Analysis of Health
Surveys**, Wiley.

Little, R. and Rubin, D. (2002, 2nd ed.) **Statistical Analysis with Missing
Data**, Wiley.

Tsiatis, A. (2006) **Semiparametric Theory and Missing Data**
(Springer Series in Statistics).

**For a current list of very useful
references related to sample survey theory, compiled by Mikhail
Sverchkov of Bureau of Labor Statistics, click here**.

**Miscellaneous Papers & Reports**

Addona, V. and Wolfson, DB. (2006). A formal test for the stationarity of
the incidence rate using data from a prevalent cohort study with
follow-up. Lifetime Data Analysis.

Asgharian, M., Wolfson, DB. and Zhang, X. (2006). Checking
stationarity of the incidence rate using prevalent cohort survival
data. Statistics in Medicine.

Chen, Jinbo and Norman Breslow (2004), *Semiparametric efficient
estimation for the auxiliary
outcome problem with the conditional mean model*,
Canad. Jour. Statist. 32, 1-14.

Gilbert, Peter B. (2000) *Large sample theory of maximum likelihood
estimates in semiparametric
biased sampling models.* Ann. Statist. 28, 151--194.

Huang, Y. and Wang, M.C. (1995), *Estimating the occurrence rate for
prevalent survival data in competing
risks models.* Journal of the
American Statistical Association 90, 1406-1415.

Kang, J. and Schafer, J.L. (2007), *Demystifying Double
Robustness: A Comparison of
Alternative Strategies for Estimating a Population Mean from
Incomplete Data*, Statist. Sci. 22, 523-539.

Korn, E. and Graubard, B. (2003) *Estimating variance components
by using survey data.*
J. R. Stat. Soc. Ser. B 65, 175--190.

Mandel, M. and Fluss, R. (2009) *Nonparametric estimation of the
probability of illness in the
illness-death model under
cross-sectional sampling.* Biometrika 96, 861-872.

Patil, G. P. and Rao, C. R. (1978). *Weighted distributions and
size-biased sampling with applications
to wildlife populations and
human families.* Biometrics 34, 179-189.

Pfeffermann, D. and Sverchkov, M. work on survey data with
semiparametrically modelled informative nonresponse.

Qin, J. (1994ff) Ann. Statist. papers on empirical likelihood.

Rao, JNK and Wu, C. (2009), *Bayesian pseudo-empirical-likelihood
intervals for complex surveys*,
J. R. Stat. Soc. Ser. B 72, 533--544.

Rotnitzky and Robins papers (some with other co-authors) on
inverse-probability weighted estimating equations for
longitudinal studies (e.g., AIDS) with informative dropout patterns.

Donald Rubin papers (with P. Rosenbaum and others) on Propensity Scores.

Yehuda Vardi papers (referenced in Gilbert paper above) on
nonparametric estimation of an underlying distribution function
in a biased-sampling setting.

Annals of Statistics paper, on nonparametric estimation under length-biased sampling.

on "A paradox concerning nuisance parameters and projected estimating functions" which is
related to ratio estimation in survey sampling but is primarily about estimating equations.

or home page of the same name) and how to use it in biased sampling problems.

on empirical likelihoods in survey sampling.

sampling is noninformative (i.e., not dependent on the measured attribute of interest).

the area of `informative' sampling, using papers of J. Beaumont (2008) and Sverchkov and
Pfeffermann (2004). [For precise references, see the bibliography document on
Survey Sampling linked within the Reading List above.]

at the RIT in MTH 1313 a 20-minute presentation on research problems and opportunities
for collaboration in his NIH Branch.

This presentation will immediately precede Dr. Albert's 3:30pm Statistics Seminar.

Prevalent Survival Data in Competing Risks Models.

estimation in the illness-death model from prevalent cohorts.

in time of prevalent cohorts, from papers (listed above) of Addona and Wolfson (2006) and
Asgharian, Wolfson, and Zhang (2006).