SAS Log of Examples & Analyses Related to Frequency Tables
==========================================================
                                                   9/25/06

data tbexmp;
   input count Cand $ Gender $ ;
   datalines;
70 Dewey F
30 Truman F
40 Dewey M
40 Truman M
;
run;

proc freq data=tbexmp;
   tables Cand * Gender / chisq nopercent norow nocol expected agree;
   weight count;
run;

                 The FREQ Procedure
               Table of Cand by Gender

      Cand      Gender
      Frequency|
      Expected |F       |M       |  Total
      ---------+--------+--------+
      Dewey    |     70 |     40 |    110
               | 61.111 | 48.889 |
      ---------+--------+--------+
      Truman   |     30 |     40 |     70
               | 38.889 | 31.111 |
      ---------+--------+--------+
      Total         100       80      180

NOTE: for expected, take rowtot * coltot / tabltot,
      e.g. 61.111 = 110*100/180 = 550/9.
      So the (1,1) cell count (70) is greater than expected.

CHISQ      X^2 = sum (OBS-EXP)^2/EXP = sum (n_{ij}-m_{ij})^2/m_{ij}
               = (70-61.111)^2*(1/61.111 + 1/48.889 + 1/38.889 + 1/31.111)
               = 7.4805
           (In a 2x2 table |n_{ij}-m_{ij}| is the same in all four cells,
            here 8.889, so the squared difference factors out of the sum.)

Lik Ratio  G^2 = 2 * sum n_{ij} log(n_{ij}/m_{ij}) = 7.4930

CHISQ Mantel-Haenszel
           QMH = (n-1) corr^2, where
           corr = [(40/180) - 31.111/180]/sqrt(110*70*100*80/180^4) = 0.2039,
           so QMH = 179 * 0.04156 = 7.439

      Statistic                     DF     Value     Prob
      Chi-Square                     1    7.4805   0.0062
      Likelihood Ratio Chi-Square    1    7.4930   0.0062
      Continuity Adj. Chi-Square     1    6.6626   0.0098
      Mantel-Haenszel Chi-Square     1    7.4390   0.0064

               Fisher's Exact Test
      ----------------------------------
      Cell (1,1) Frequency (F)        70
      Left-sided Pr <= F          0.9981
      Right-sided Pr >= F         0.0049

      Table Probability (P)       0.0030
      Two-sided Pr <= P           0.0087

      Sample Size = 180

Check in R, with X = (1,1) cell count ~ hypergeometric:
      phyper(70, 100, 80, 110, lower.tail = F) = .001915563 = P(X > 70),
         so P(X <= 70) = 1 - .001915563 = 0.998084   (left-sided, 0.9981)
      phyper(69, 100, 80, 110) = 0.995094 = P(X <= 69),
         so P(X >= 70) = 1 - 0.995094 = 0.004906     (right-sided, 0.0049)
      P(X = 70) = 0.004906 - 0.001916 = 0.0030        (table probability), etc.

          McNemar's Test
      -----------------------
      Statistic (S)    1.4286
      DF                    1
      Pr > S           0.2320

McNemar's statistic = (n_{12}-n_{21})^2/(n_{12}+n_{21}), to test n_{1+} = n_{+1}.
      So statistic = (40-30)^2/(40+30) = (10)^2/70 = 1.429

          Simple Kappa Coefficient
      --------------------------------
      Kappa                     0.2025
      ASE                       0.0731
      95% Lower Conf Limit      0.0593
      95% Upper Conf Limit      0.3458

In 2x2 case,
      Kappa = (n_{11}+n_{22} - (n_{1+}n_{+1} + n_{2+}n_{+2})/n) /
              (n - (n_{1+}n_{+1} + n_{2+}n_{+2})/n)
            = (110 - 92.222)/(180 - 92.222) = 0.2025

      > .2025 + c(-1,1)*1.96*.0731
      = c(0.059224, 0.345776)
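
The hand calculations annotated in this log can be cross-checked in R, which the notes already use for the phyper and confidence-interval checks. The following is a minimal sketch, not part of the original SAS session: the object names (tab, m, X2, G2, and so on) are chosen here for illustration, and only base R functions (outer, pchisq, phyper, dhyper) are assumed.

```r
# Observed counts from the log: rows = Cand (Dewey, Truman), cols = Gender (F, M).
tab <- matrix(c(70, 40,
                30, 40), nrow = 2, byrow = TRUE,
              dimnames = list(Cand = c("Dewey", "Truman"), Gender = c("F", "M")))
n  <- sum(tab)
rt <- rowSums(tab)                 # row totals: 110, 70
ct <- colSums(tab)                 # column totals: 100, 80
m  <- outer(rt, ct) / n            # expected counts: rowtot * coltot / tabltot

# Chi-square statistics, as in the annotated formulas.
X2   <- sum((tab - m)^2 / m)                       # Pearson X^2
G2   <- 2 * sum(tab * log(tab / m))                # likelihood ratio G^2
Xc   <- sum((abs(tab - m) - 0.5)^2 / m)            # continuity-adjusted X^2
corr <- (tab[2, 2] / n - m[2, 2] / n) / sqrt(prod(rt, ct) / n^4)
QMH  <- (n - 1) * corr^2                           # Mantel-Haenszel
pval <- function(q) pchisq(q, df = 1, lower.tail = FALSE)

# Fisher's exact test: X = (1,1) count, hypergeometric with 100 F, 80 M, 110 draws.
p_left  <- phyper(70, 100, 80, 110)                      # P(X <= 70)
p_right <- phyper(69, 100, 80, 110, lower.tail = FALSE)  # P(X >= 70)
p_table <- dhyper(70, 100, 80, 110)                      # P(X = 70)
x       <- 30:100                                        # support of X
d       <- dhyper(x, 100, 80, 110)
p_two   <- sum(d[d <= p_table * (1 + 1e-7)])             # tables no more likely than observed

# McNemar's statistic and simple kappa.
S     <- (tab[1, 2] - tab[2, 1])^2 / (tab[1, 2] + tab[2, 1])
po    <- sum(diag(tab)) / n        # observed agreement
pe    <- sum(rt * ct) / n^2        # agreement expected by chance
kappa <- (po - pe) / (1 - pe)

print(round(c(X2 = X2, G2 = G2, Xc = Xc, QMH = QMH, S = S, kappa = kappa), 4))
print(round(c(pX2 = pval(X2), pG2 = pval(G2), pXc = pval(Xc),
              pQMH = pval(QMH), pS = pval(S)), 4))
print(round(c(left = p_left, right = p_right, table = p_table, two = p_two), 4))
```

The printed values should agree with the PROC FREQ output above to the precision shown. Note that phyper with lower.tail = FALSE returns a strict upper tail, P(X > q), so the right-sided probability P(X >= 70) uses q = 69.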