Homework Problem 2, STAT 705, Fall 2016.
Assigned 8/31/2016, Due Friday 9/9/2016

Download the ASCII (text) data file
   http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.csv
concerning passengers on the Titanic, with description available at
   http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic.html
and at other sites mentioned there. You can get these data into a data-frame Titanic3 with 14 numeric and character columns (with headings) by issuing the commands

> library(foreign)
> Titanic3 = read.csv("http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.csv", stringsAsFactors=FALSE)

That is the starting point for this problem.

#---------------------------------------------------------------

(a) How many different people have data in this dataset? NOTE: there are some duplicate names. How many? Do you think that the duplicate pairs represent different people? Give reasoning based on information in the dataset.
Relevant commands: unique, duplicated

(b) As is true of many datasets, this one has some missing (NA) values in numeric columns and blanks in character columns (which are also "missing" in the same sense). Designate each column as numeric or character, and count the missing values in each. Also count the individual records (rows) of the data-frame that have respectively 1, 2, ... missing fields. What is the maximum number of missing fields in a single record?
Relevant commands: is.na, apply

(c) Tabulate the survival rates, by sex and passenger class (pclass), for individuals in the Titanic3 data file.
Relevant command: table

(d) Exploratory (descriptive) questions, to answer as time permits:
--- at what young age x does there no longer seem to have been a survival advantage to being aged <= x on the Titanic? Does this seem to have varied with passenger class?
--- what were the home-destinations of Titanic passengers, and what were the average ages and survival rates of passengers corresponding to each?
--- does the survival rate of passengers within each passenger class seem to depend at all on the fare paid ?
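
#---------------------------------------------------------------
Illustration of the relevant commands. The following is only a rough sketch of how unique, duplicated, is.na, apply and table behave, run on a small made-up data frame rather than on Titanic3. The data frame toy, its values, and all variable names below are hypothetical and chosen purely for illustration; only the column names name, sex, pclass, age, cabin and survived are meant to echo columns of the real file. It is not a solution to parts (a)-(c), and other approaches (for example tapply) would work equally well.

```r
# Toy data frame with made-up values (NOT taken from titanic3.csv)
toy <- data.frame(
  name     = c("Ann", "Bob", "Ann", "Cy", "Dee", "Eve"),
  sex      = c("female", "male", "female", "male", "female", "male"),
  pclass   = c(1L, 3L, 2L, 3L, 1L, 1L),
  age      = c(30, NA, 45, 8, NA, 52),
  cabin    = c("C22", "", "", "", "B5", ""),
  survived = c(1L, 0L, 1L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# unique / duplicated: count distinct names and find the repeated ones
n_unique_names <- length(unique(toy$name))
dup_names      <- unique(toy$name[duplicated(toy$name)])
# List every row that shares a repeated name, so the other fields can be compared
dup_rows <- toy[toy$name %in% dup_names, ]

# Column types, one entry per column
col_type <- sapply(toy, class)

# is.na: a field counts as missing if it is NA, or an empty string in a character column
is_missing <- sapply(toy, function(col) is.na(col) | (is.character(col) & col == ""))

# Missing values per column, and per row (apply over rows of the logical matrix)
missing_per_column <- colSums(is_missing)
missing_per_row    <- apply(is_missing, 1, sum)
rows_by_n_missing  <- table(missing_per_row)   # how many rows have 0, 1, 2, ... missing fields
max_missing        <- max(missing_per_row)

# table: counts by sex, class and survival, then the proportion surviving in each sex-by-class cell
surv_counts <- table(sex = toy$sex, pclass = toy$pclass, survived = toy$survived)
surv_rate   <- prop.table(surv_counts, margin = c(1, 2))[, , "1"]

print(dup_rows)
print(rows_by_n_missing)
print(surv_rate)   # empty sex-by-class cells show up as NaN
```

To try the same steps on the real data, one would replace toy by Titanic3 after the read.csv call above, and check the column names with names(Titanic3) first.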