Log to Illustrate SAS Datastep Syntax
=====================================

(I) Average in a SAS dataset calculated from scratch

data pima ;
   set home.pima (keep = insulin diab
                  rename = (insulin = x  diab = a));
   /* keeps only the indicated columns and renames them: X, A */

data avgins;
   set pima END = LAST;
   if _N_=1 then totx = 0;
   totx + x ;
   /* could achieve the same thing with
         TOTX = TOTX + X ;
      if there had first been a line:
         RETAIN TOTX ;                   */
   if last then do;
      avgx = totx / _N_ ;
      output;
   end;

proc print;
   title "Average done from Scratch";
   var avgx;

proc means data = home.pima n mean ;
   title "NOBS & AVG, Proc MEANS";
   var insulin;
run;

Average done from Scratch
                 14:15 Thursday, September 21, 2006  15

Obs      avgx
  1    79.7995

NOBS & AVG, Proc MEANS
                 14:15 Thursday, September 21, 2006  16

The MEANS Procedure

Analysis Variable : insulin

  N            Mean
-------------------
768      79.7994792
-------------------

(II) Calculate MAX of INSULIN=X within each of the two groups
     defined by A = DIAB = 0,1

options linesize = 50 nodate;

proc sort data = pima out = pimasrt;
   by A;

data minXgps (keep = maxx A);
   /* despite its name, MINXGPS holds the group maxima */
   set pimasrt end = last ;
   by A;
   retain maxx;
   if first.A then maxx = 0;
   if X > maxx then maxx = X ;
   if last.A then output;

proc print;
   title "Group MAX, from Scratch";
run;

Group MAX, from Scratch                          21

Obs    a    maxx
  1    0     744
  2    1     846

/* Now contrast with what we would get from proc means */

proc means data=pimasrt max maxdec=3;
   class A ;
   var x ;
   title "Group Max and Count, Proc MEANS";
run;

Group Max and Count, Proc MEANS                  23

The MEANS Procedure

Analysis Variable : x

           N
  a      Obs       Maximum
----------------------------
  0      500       744.000
  1      268       846.000
----------------------------
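
The two hand-written steps above rely on X having no missing values and never being negative, which holds for this log (N = 768 in both calculations). As a rough sketch only, not part of the original run, the same ideas can be written a little more defensively. The dataset names AVGINS2 and MAXXGPS and the counter NX are hypothetical; PIMA and PIMASRT are assumed to exist as created above.

```sas
/* Sketch (I): running total with an explicit RETAIN instead of  */
/* the sum statement. NX counts only nonmissing X, so the        */
/* divisor matches the N that PROC MEANS reports.                */
data avgins2;                     /* hypothetical dataset name   */
   set pima end = last;
   retain totx 0 nx 0;            /* carried across rows, start at 0 */
   if x ne . then do;             /* skip missing values explicitly  */
      totx = totx + x;
      nx   = nx + 1;
   end;
   if last then do;
      if nx > 0 then avgx = totx / nx;   /* AVGX stays missing if no data */
      output;
   end;
   keep avgx nx;
run;

/* Sketch (II): group maximum that starts each BY group at       */
/* missing rather than 0, so it would also work for negative X.  */
/* The MAX function ignores missing arguments.                   */
data maxxgps (keep = a maxx);     /* hypothetical dataset name   */
   set pimasrt;                   /* must already be sorted by A */
   by a;
   retain maxx;
   if first.a then maxx = .;
   maxx = max(maxx, x);
   if last.a then output;
run;
```

On the data shown in this log both sketches should agree with the results above, since every X is present and nonnegative; the difference only matters for data with missing or negative values.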