Homework Problem 3, STAT 705, Fall 2015.
Assigned 9/9/2015, Due 9/21/2015

(a) Define and test an R function "frmsmry" which does the following.

Your input argument is:
   infram: a data-frame containing one or more columns

NOTE: this assignment contemplates only using input data-frames with all
numeric, character, or logical columns. You could do one of three things,
any of which is OK with me (but say in your submitted Homework which of
these options you choose):
   (i) you could ignore this issue, and make sure to apply your function
       only to data-frames with all numeric, character, or logical columns;
   (ii) you could write into your function a check for whether the input
       data-frame contains any other types of columns, and terminate with
       an error message if so; or
   (iii) you could allow factor columns too, treat them as character, and
       convert them to "character" before doing the main task of the function.
I THINK THE BEST OPTION IS (ii), BUT THE EASIEST IS (i).

Your function should compute, and output as a list, the following components:
   num.indx = the integer vector of column indices for which the columns are numeric
   char.indx = the integer vector of column indices for which the columns are character
   bool.indx = the integer vector of column indices for which the columns are logical
   num.miss = the integer vector containing the number of missing elements in each column
   newfram = the input data-frame with all missing character values replaced
             by "" (blank) and with all missing numeric values replaced by -999

NOTE THAT ONE OR MORE OF THE xxxx.indx INDEX VECTORS MAY BE EMPTY, IN WHICH
CASE YOU SHOULD ASSIGN THEM AS "NULL" LIST COMPONENTS.

BE CAREFUL THAT YOUR FUNCTION WORKS EQUALLY WELL WHETHER THE NUMBER OF
COLUMNS IS 1 OR IS GREATER THAN 1.
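The column-type check described in option (ii) could be sketched as follows. This is only one possible way to do it, not a required form: the helper name check_cols and the wording of the error message are hypothetical, and the check treats integer columns as numeric.

```r
# Hypothetical helper (not part of the assignment text): stop with an error
# if any column is not numeric, character, or logical.
check_cols <- function(infram) {
  stopifnot(is.data.frame(infram))
  # One TRUE/FALSE per column; is.numeric() is FALSE for factors and dates.
  ok <- vapply(infram,
               function(col) is.numeric(col) || is.character(col) || is.logical(col),
               logical(1))
  if (!all(ok)) {
    stop("unsupported column type(s) in column(s): ",
         paste(names(infram)[!ok], collapse = ", "))
  }
  invisible(TRUE)
}

# Passes silently: integer, character, and logical columns only.
check_cols(data.frame(x = 1:3, y = c("a", "b", NA), z = c(TRUE, NA, FALSE),
                      stringsAsFactors = FALSE))

# A factor column would stop with an error:
# check_cols(data.frame(f = factor(c("u", "v"))))
```

Calling such a check as the first statement of frmsmry would give option (ii); vapply is used so that the result is a logical vector even when the data-frame has a single column.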
---- SOLN to part (a), using option (iii)

frmsmry = function(infram) {
   clasvec = sapply(infram, class)
   facs = which(clasvec=="factor")
   ## option (iii): convert the factor columns (and only those) to character
   if(length(facs)) {
      for(i in facs) infram[,i] = as.character(infram[,i])
      clasvec = sapply(infram,class)
   }
   num.indx = which(clasvec=="numeric" | clasvec=="integer")
   char.indx = which(clasvec=="character")
   bool.indx = which(clasvec=="logical")
   ## number of missing values in each column
   num.miss = sapply(infram, function(col) sum(is.na(col)))
   newfram = infram
   ## replace missing numeric values by -999, missing character values by ""
   for(i in num.indx) if(num.miss[i])
      newfram[is.na(newfram[,i]),i] = -999
   for(i in char.indx) if(num.miss[i])
      newfram[is.na(newfram[,i]),i] = ""
   ## empty index vectors are returned as NULL components
   list(num.indx = if(length(num.indx)) num.indx else NULL,
        char.indx = if(length(char.indx)) char.indx else NULL,
        bool.indx = if(length(bool.indx)) bool.indx else NULL,
        num.miss = num.miss, newfram = newfram)
}

> frmsmry(data.frame(V1=c("A","B","C",NA,"D")))
$num.indx
NULL

$char.indx
V1
 1

$bool.indx
NULL

$num.miss
V1
 1

$newfram
  V1
1  A
2  B
3  C
4
5  D

> tmp = data.frame(V1 = c(1:6,NA), V2 = as.logical(c(1,rep(c(1,0),3))),
        V3 = 7:13, V4 = c(" ",letters[5:9],NA), stringsAsFactors=F)
> tmp
  V1    V2 V3   V4
1  1  TRUE  7
2  2  TRUE  8    e
3  3 FALSE  9    f
4  4  TRUE 10    g
5  5 FALSE 11    h
6  6  TRUE 12    i
7 NA FALSE 13 <NA>

> frmsmry(tmp)
$num.indx
V1 V3
 1  3

$char.indx
V4
 4

$bool.indx
V2
 2

$num.miss
V1 V2 V3 V4
 1  0  0  1

$newfram
    V1    V2 V3 V4
1    1  TRUE  7
2    2  TRUE  8  e
3    3 FALSE  9  f
4    4  TRUE 10  g
5    5 FALSE 11  h
6    6  TRUE 12  i
7 -999 FALSE 13

-----------------------------------------------------------------

(b) Define and test an R function "sampmat" which does the following:

Your input arguments are:
   inmat = a numeric matrix
   frac = a scalar in (0,1)
   a,b = scalar parameters, a [...]

> sampmat(diag(5),.7,0,1)
, , 1

     [,1]      [,2]      [,3]     [,4]      [,5]
[1,]    1 0.0000000 0.2370034 0.000000 0.0000000
[2,]    0 1.0000000 0.0000000 0.000000 0.8486207
[3,]    0 0.0000000 0.8941277 0.337898 0.0000000
[4,]    0 0.4264294 0.0000000 1.000000 0.0000000
[5,]    0 0.0000000 0.9631163 0.000000 1.0000000

, , 2

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    0    1    1
[2,]    1    1    1    1    0
[3,]    1    1    0    0    1
[4,]    1    0    1    1    1
[5,]    1    1    0    1    1

> sampmat(array(1,c(6,3)),.3,5,7)
, , 1

         [,1]     [,2]     [,3]
[1,] 1.000000 1.000000 1.000000
[2,] 5.579545 5.548244 1.000000
[3,] 5.036810 1.000000 1.000000
[4,] 6.641460 6.036512 5.817911
[5,] 6.091711 5.853334 1.000000
[6,] 5.581995 6.885847 6.135198

, , 2

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    0    0    1
[3,]    0    1    1
[4,]    0    0    0
[5,]    0    0    1
[6,]    0    0    0
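
The two printed examples are consistent with one simple rule, sketched here strictly as a guess: each entry of inmat is kept independently with probability frac and is otherwise replaced by a Uniform(a,b) draw, with the second slice of the result recording 1 for kept entries and 0 for replaced ones. That rule, the requirement a < b, and the name sampmat_sketch are assumptions inferred from the output shown, not the assignment's stated definition.

```r
# Sketch only: the keep/replace rule below is an ASSUMPTION inferred from the
# printed examples, not the assignment's definition of sampmat.
sampmat_sketch <- function(inmat, frac, a, b) {
  stopifnot(is.matrix(inmat), is.numeric(inmat),
            frac > 0, frac < 1, a < b)
  n <- length(inmat)
  keep <- rbinom(n, 1, frac)              # 1 = keep original entry, 0 = replace
  newmat <- inmat
  repl <- keep == 0
  newmat[repl] <- runif(sum(repl), a, b)  # replacements drawn from Uniform(a, b)
  # Slice 1: modified matrix; slice 2: keep indicators, in the same layout.
  array(c(newmat, keep), dim = c(dim(inmat), 2))
}

set.seed(1)                               # for reproducibility of the draw
out <- sampmat_sketch(diag(5), 0.7, 0, 1)
dim(out)                                  # 5 5 2
```

Because the entries are sampled independently, the number of replaced entries varies from call to call, which matches the examples (6 of 25 replaced with frac = .7, 11 of 18 replaced with frac = .3) better than a fixed-size sample would.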