RESUME LOG WITH ADDITIONAL EXAMPLES.
===================================

(I) Use of the chisq option with RxC table data, with R and C > 2
    (and not necessarily equal).

libname home ".";

proc format ;
  value agedc   Low -< 40  = 35
                40 - 49    = 45
                50 - 59    = 55
                60 - HIGH  = 65 ;
  value salint  0 - 250    = 1
                251 - 400  = 2
                401 - HIGH = 3 ;

data newceo;
  infile "ASCdata/CEO.dat";
  input age salry ;
  agegp = put(age, agedc.);
  salgp = put(salry, salint.);
run;

proc freq data=newceo;
  tables salgp * agegp / nocum norow nocol nopercent chisq ;
  title "Formatted Table, Recoded";
run;

              Formatted Table, Recoded

              Table of salgp by agegp

salgp     agegp

Frequency|35      |45      |55      |65      |  Total
---------+--------+--------+--------+--------+
1        |      0 |      6 |      6 |      4 |     16
---------+--------+--------+--------+--------+
2        |      5 |      7 |      8 |      2 |     22
---------+--------+--------+--------+--------+
3        |      0 |      6 |     10 |      5 |     21
---------+--------+--------+--------+--------+
Total           5       19       24       11       59

        Statistics for Table of salgp by agegp

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     6     10.7487    0.0965
Likelihood Ratio Chi-Square    6     12.3880    0.0539
Mantel-Haenszel Chi-Square     1      0.2272    0.6336
...
WARNING: 50% of the cells have expected counts less
         than 5. Chi-Square may not be a valid test.

Sample Size = 59

(II) Use of FORMAT to re-code the data again into 2x2 form, to
     illustrate the "agree" option. I start by finding the medians
     of age and salry and use them as breakpoints.

proc means data=newceo median Q1 Q3;
  var age salry;
run;

                                  Lower          Upper
Variable         Median        Quartile       Quartile
--------------------------------------------------------
age          50.0000000      45.0000000     57.0000000
salry       350.0000000     250.0000000    543.0000000
--------------------------------------------------------

proc format ;
  value $age2gp   Low - 50   = "<=50"
                  51 - HIGH  = ">50" ;
  value $sal2int  LOW - 350  = "L"
                  351 - HIGH = "H" ;

data newceo2;
  set newceo (rename = (age = agenum salry = salnum));
  length age $ 2 salry $ 3;
  /* Setting the length of each character variable to its maximum
     number of digits is important, because the character format
     ranges are compared lexicographically !! */
  age = agenum;
  salry = salnum;

proc freq data=newceo2;
  tables salry * age / agree nocum nopercent norow nocol;
  format salry $sal2int. age $age2gp.;
run;

salry     age

Frequency|<=50    |>50     |  Total
---------+--------+--------+
L        |     16 |     16 |     32
---------+--------+--------+
H        |     14 |     13 |     27
---------+--------+--------+
Total          30       29       59

  Statistics for Table of salry by age

      McNemar's Test
-----------------------
Statistic (S)    0.1333
DF                    1
Pr > S           0.7150

   Simple Kappa Coefficient
--------------------------------
Kappa                   -0.0184
ASE                      0.1299
95% Lower Conf Limit    -0.2729
95% Upper Conf Limit     0.2361

Sample Size = 59
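
As a hand check on the "agree" output, the McNemar statistic and the simple kappa can be recomputed from the four cell counts of the 2x2 table. The DATA step below is only a sketch under the usual textbook formulas, which I am assuming are what PROC FREQ uses here: S = (n12 - n21)**2 / (n12 + n21), and kappa = (po - pe) / (1 - pe). The variable names (n11, n12, n21, n22, po, pe) are my own and do not come from the log above.

```sas
/* Hand check of the AGREE statistics from the 2x2 counts printed above. */
/* All variable names here are illustrative, not part of the original.   */
data _null_;
  /* cell counts: rows are salry (L, H), columns are age (<=50, >50) */
  n11 = 16;  n12 = 16;
  n21 = 14;  n22 = 13;
  n   = n11 + n12 + n21 + n22;

  /* McNemar: uses only the two off-diagonal (discordant) cells */
  S  = (n12 - n21)**2 / (n12 + n21);
  pS = 1 - probchi(S, 1);          /* upper tail of chi-square, 1 DF */

  /* Simple kappa: observed agreement vs. agreement expected by chance */
  po    = (n11 + n22) / n;
  pe    = ((n11 + n12)*(n11 + n21) + (n21 + n22)*(n12 + n22)) / n**2;
  kappa = (po - pe) / (1 - pe);

  put S= 8.4 pS= 8.4 kappa= 8.4;   /* written to the SAS log */
run;
```

If the assumed formulas are the right ones, the values written to the log should agree with the Statistic (S), Pr > S and Kappa lines reported above. Note that S depends only on the 16 and 14 in the off-diagonal cells, so it says nothing about how many observations fall on the diagonal.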