Question 2. (To be done using 'R'.) For this question, you will work with the Pima Indians Diabetes Database data set in R, named 'PimaIndiansDiabetes', from the 'mlbench' library. The complete data set can be seen by typing PimaIndiansDiabetes into the R console; however, for this question we will work with subsets of this data frame. As part of your solution, you will have to take screenshots of or save some output, so I suggest submitting the entirety of this question as a Word or LaTeX document (use \begin{verbatim} Your R output \end{verbatim}), separate from the handwritten document you scan and submit for the previous questions.

a) The first step is to remove the columns that we will not use in the analysis from the original PimaIndiansDiabetes data set, including the 'diabetes' variable, which contains the outcome (positive/negative), so that all 768 observations in the sample can be thought of as coming from the same population, i.e. assume we did not know they were separated by diabetes outcome. To do this, we create a new data frame called sample.data, which consists of all rows of the PimaIndiansDiabetes data frame but only columns 1 to 4, using the following code:

    > library(mlbench)
    > data("PimaIndiansDiabetes")
    > sample.data <- data.frame(PimaIndiansDiabetes[, 1:4])

i) For this new data set, we will assume that the observation vectors come from a multivariate normal distribution. By creating and analysing Q-Q plots, conclude whether this is a valid assumption and give reasons for your conclusion.

ii) Calculate the sample mean vector $\bar{\mathbf{x}}$ and the sample covariance matrix $\mathbf{S}$ for this data.

iii) Regardless of your conclusion in part i), we continue with the assumption of multivariate normality. Using the HotellingsT2() function, print and analyse the output to test the null hypothesis $H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0$, where $\boldsymbol{\mu}_0 = (4, 120, 70, 20)^\top$. [Note that you will have to install the packages "mvtnorm" and "ICSNP" to use the HotellingsT2() function.]

iv) Using the plot() function, plot the sample profile for this data set, with solid points at each value, a dashed line connecting the points, and sensibly named axes and title.
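The steps in part a) could be set up in R roughly as follows. This is only a sketch under stated assumptions, not a model answer: it assumes the mlbench and ICSNP packages are installed, takes the hypothesised mean to be (4, 120, 70, 20), and the object names xbar, S, d2, mu0 and test.out are illustrative choices. The chi-square Q-Q plot of squared Mahalanobis distances is one common way to check joint normality alongside the marginal plots; the question itself does not prescribe it. Interpreting the plots and the test output is left to the reader.

```r
# Sketch only: assumes the mlbench and ICSNP packages are installed.
library(mlbench)
library(ICSNP)

data("PimaIndiansDiabetes")
sample.data <- data.frame(PimaIndiansDiabetes[, 1:4])
n <- nrow(sample.data)  # number of observations
p <- ncol(sample.data)  # number of variables

# i) Marginal normal Q-Q plots, one panel per variable
op <- par(mfrow = c(2, 2))
for (v in names(sample.data)) {
  qqnorm(sample.data[[v]], main = paste("Normal Q-Q plot:", v))
  qqline(sample.data[[v]])
}
par(op)

# ii) Sample mean vector and sample covariance matrix
xbar <- colMeans(sample.data)
S <- cov(sample.data)

# i) continued: chi-square Q-Q plot of squared Mahalanobis distances
# (an assumed extra check of joint normality, not required by the question)
d2 <- mahalanobis(sample.data, center = xbar, cov = S)
plot(qchisq(ppoints(n), df = p), sort(d2),
     xlab = "Chi-square quantiles",
     ylab = "Ordered squared Mahalanobis distances",
     main = "Chi-square Q-Q plot")
abline(0, 1, lty = 2)

# iii) One-sample Hotelling's T^2 test of H0: mu = mu0
mu0 <- c(4, 120, 70, 20)
test.out <- HotellingsT2(sample.data, mu = mu0)
print(test.out)

# iv) Sample profile: solid points (pch = 19) joined by a dashed line (lty = 2)
plot(seq_len(p), xbar, type = "b", pch = 19, lty = 2, xaxt = "n",
     xlab = "Variable", ylab = "Sample mean",
     main = "Sample profile of sample.data")
axis(1, at = seq_len(p), labels = names(xbar))
```

The printed test.out object reports the test statistic, its degrees of freedom and the p-value, which is what part iii) asks you to analyse.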


Question

This needs to be solved correctly. I need only the answers, not the code; please just write the answers to all parts and get a thumbs up.


Expert Solution