Quick access to Directories: Logs used in Lectures Illustrative Scripts Printed pdf Handouts
Click here for a generic Stat 430 Course Syllabus, and here for the current course outline.
Pointer to a listing of current readings and web-page materials to work through, by week.
Pointer to new Homework Assignments and old HW solutions
Pointer to Illustrative Scripts Pointer to Printed Handouts
Instructor: Eric Slud, Statistics Program, Math. Dept.
Office: Mth 2314, x5-5469, email evs@math.umd.edu
Office hours: M 2-3, Th 11-12
This course is an introduction to statistical and graphical techniques of data analysis and their implementation in the SAS programming language/platform. The emphasis is on data analysis skills, but since one such important skill is justification of assumptions and understanding of the rationale behind analyses, the course develops ideas and explains concepts from statistical theory.
Prerequisite: Stat 400. The material needed is mostly definitions and
concepts, and some basic algebraic manipulations involving probabilities. But later
in the course, some understanding of Stat 400 material on distributions of functions
of random variables will help you make sense of statistical simulation
methods. (That material will be reviewed as needed.)
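As a rough illustration of how distributions of functions of random variables connect to simulation, the sketch below compares a simulated and an exact probability for the sum of two independent Uniform(0,1) variables. It is written in Python rather than SAS purely for illustration; the choice of example, the function names, the seed, and the number of draws are all assumptions and are not taken from the course materials.

```python
import random


def exact_sum_cdf(t):
    """Exact P(U1 + U2 <= t) for independent Uniform(0,1) variables U1, U2."""
    if t <= 0:
        return 0.0
    if t <= 1:
        return t * t / 2.0
    if t < 2:
        return 1.0 - (2.0 - t) ** 2 / 2.0
    return 1.0


def simulate_sum_cdf(t, n_draws=100_000, seed=430):
    """Monte Carlo estimate of P(U1 + U2 <= t) from n_draws simulated pairs."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = sum(1 for _ in range(n_draws) if rng.random() + rng.random() <= t)
    return hits / n_draws


if __name__ == "__main__":
    for t in (0.5, 1.0, 1.5):
        print(t, exact_sum_cdf(t), simulate_sum_cdf(t))
```

The simulated proportions should fall close to the exact values, with the discrepancy shrinking roughly like one over the square root of the number of draws.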
Text: Ronald Cody and Jeffrey Smith, Applied Statistics and
the SAS Programming Language,
5th ed., Prentice-Hall, 2006.
Course requirements and Grading: Most of the
work for the course will consist of 8-10 graded Problem Sets.
These will involve writing and running small SAS programs and
interpreting the sequence of data-analysis operations and
outputs.
While you will be permitted to share hints and information
concerning SAS programming, the reasoning behind analyses, summaries
of them, interpretation of results, and the edited copies you hand in
must be exclusively your own work.
In addition, there will be an in-class test toward the end
of March, on basics of the SAS language and concepts underlying
data-display and statistics in categorical data, two-sample
comparisons, and simple linear regression. Finally, there will be
a slightly more ambitious data-analysis term project in place of a
Final Exam (due Friday, May 15, 5pm).
The course grade will be
based on a weighted average of your homework, test, and project
grades, with 50% weight on Homework scores (with none dropped)
and 25% for the Test and 25% for the Term Project.
The in-class Midterm test will be given on Monday, March 30, 2009.
It will cover material from Chapters 1-3 (omitting sections 3.M, 3.P
and 3.R), 5 through 5.F, 6.A-B, 13.A-D,H-K, and 14.A-D, plus several
handouts (the ones on Plotting, Histograms, QQplots, Empirical
Distribution Functions, and Partial Correlations).
Only material covered in class (through Wed., Mar. 25) and in the
handouts and scripts will be within scope for the test.
NOTE: you can bring one 2-sided notebook sheet to the test
as a memory aid.
Except for your notebook sheet, the test is
closed-book. You can use a calculator,
but I will not ask for
much if any arithmetic.
Click here for Data Analysis Term
Project Guidelines.
(Due Date: Friday May 15, 5pm)
The University of Maryland, College Park has a nationally
recognized Code of Academic Integrity, administered by the Student Honor Council.
This Code sets standards for academic integrity at Maryland for all
undergraduate and graduate students. As a student you are responsible
for upholding these standards for this course. It is very important for
you to be aware of the consequences of cheating, fabrication,
facilitation, and plagiarism. For more information on the Code of
Academic Integrity or the Student Honor Council, please click
here.
Homework Assignments and
Solutions.
The current problem assignment can be found here. Here is an additional link for current reading
which
you should do in order to follow the lectures and practice the tools
for the Homework.
Solutions to that assignment and
selected problem solutions (other than those included
in example
Scripts) will be posted to the
HWSoln Directory as the term progresses.
HOMEWORK GUIDELINE
Please remember for all Homework and Project papers to be handed in
for this course: the consistent guideline is to hand in only as much
SAS code as will show that you did the computations correctly using
SAS, and only as much output, edited into a coherent narrative where
narrative and explanations are requested, as is needed to answer the
questions asked and to justify the sequence of steps and conclusions
you have made.
You will be graded down for handing in lots of extraneous material!
HANDOUTS
(with many more to come)
(1). Click SASintro
for a step-by-step discussion about how to get started in SAS on
University or other machines.
By clicking here, you can download free
`X-windows' software that will allow you
to create the X-windows
needed to use SAS in your WAM account from a home PC.
NOTE: the xlivecd software works only for
Windows versions up to and including XP,
not Vista. Another approach
that works very well is to install the free software Xming:
do this by following the instructions
here VERY closely.
(2). Click Plotting for some information about how to generate high-quality plots in SAS.
(3). Click
here for a handout containing a useful list of available SAS
functions (of which
the Sample Statistics, Quantile Functions, and
Probability & Density functions will be
the most useful in this
course.)
(4). Click here to find a copy of the course outline and the current problem assignment.
(5). For handouts related to material
covered in class on February 6 and 9, 2009, click
Empirical Distribution Functions
or Scaled Relative Frequency Histograms.
(6). For a handout discussing the relative
interpretability of relative risks and odds
ratios in
analyzing two-way frequency-table datasets, click here.
(7). Click here
for a sample test indicating coverage by topics along with some
sample questions. For a recent Sample Test,
click here. For an outline of the topics
and types of questions particularly relevant for this semester (Fall
2008), click here.
A new set of Fall 2008 sample test problems
can be found here.
(8). A handout giving the theoretical formulas
for confidence and prediction
intervals in simple linear
(normal-errors) regression can be found here. It contains
justifications and formulas
for the calculations SAS does of CLM and CLI confidence
and prediction
limits.
(9). A handout and Worksheet on Partial Correlation, including
definitions and three Problems, which are
to be handed in as part of Homework Set 5.
(10). For a freely downloadable textbook on "Residuals and
Influence in Regression",
by Cook and Weisberg, visit this website.
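Handout (8) above concerns the CLM and CLI limits in simple linear (normal-errors) regression. As a hedged illustration of the standard textbook formulas, the sketch below computes both intervals at a new point x0 in Python (not SAS). The function name, the toy data, and the approach of passing in the t critical value by hand are assumptions made for this example; the handout itself remains the authoritative statement of the formulas.

```python
import math


def slr_intervals(x, y, x0, t_crit):
    """Fitted value at x0 with confidence limits for the mean response (CLM)
    and prediction limits for an individual response (CLI) in simple linear
    regression. t_crit is the t quantile with n-2 degrees of freedom for the
    desired level, supplied by the caller (for example from a t table)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = ybar - b1 * xbar               # least-squares intercept
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))        # residual standard deviation
    fit = b0 + b1 * x0
    h = 1.0 / n + (x0 - xbar) ** 2 / sxx
    half_mean = t_crit * s * math.sqrt(h)        # half-width for mean response
    half_pred = t_crit * s * math.sqrt(1.0 + h)  # half-width for a new observation
    return {
        "fit": fit,
        "clm": (fit - half_mean, fit + half_mean),
        "cli": (fit - half_pred, fit + half_pred),
    }


if __name__ == "__main__":
    # Toy data, invented for illustration; 3.182 is approximately the
    # 0.975 quantile of the t distribution with 3 degrees of freedom.
    x = [1, 2, 3, 4, 5]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(slr_intervals(x, y, x0=3, t_crit=3.182))
```

The prediction interval is always wider than the confidence interval for the mean at the same x0, because it adds the variance of a single new observation to the uncertainty in the fitted line.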
SCRIPTS
I have provided a series of illustrative scripts, including handouts
from class and expanded examples of working SAS programs discussed
in class.
Click Scripts to find the
directory of text Logs and Scripts of SAS example sessions.
DATA DIRECTORY: Click here to find a directory of available Datasets.
Throughout the term, additional links will be posted here
to various online
data sources and repositories: