Practice of Statistics in the Life Sciences
4th Edition
ISBN: 9781319013370
Author: Brigitte Baldi, David S. Moore
Publisher: W. H. Freeman
Question
Chapter 28, Problem 28.14AYK

(a)

To determine

To make scatterplots with taste on the y axis, find the correlation coefficients, and explain which relationships are linear and which explanatory variable has the strongest correlation with taste.

(a)

Expert Solution

Answer to Problem 28.14AYK

Hydrogen sulfide has the strongest correlation with taste (r ≈ 0.756).

Explanation of Solution

The experimenters assessed the concentrations of lactic acid, acetic acid and hydrogen sulfide in thirty randomly chosen pieces of cheddar cheese, and the data are given in a table. The scatterplots with taste on the y axis are as follows:

[Figure: scatterplots of taste (y axis) against Acetic, Lactic and H2S, with fitted lines and R-square values]

The scatterplot shows that the three fitted lines are almost parallel and that the R-square with taste is largest for hydrogen sulfide, so hydrogen sulfide has the strongest correlation with taste. For a simple linear regression, the correlation is the square root of the R-square shown in the scatterplot, with the sign of the slope. The calculation is:

    Acetic=SQRT(0.302)
    Lactic=SQRT(0.3055)
    H2S=SQRT(0.5712)

The results are:

    Acetic    0.549545
    Lactic    0.552721
    H2S       0.755778
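
The square-root step above can be reproduced outside Excel. The following is a minimal sketch that uses only the R-square values quoted in this solution; the function name `correlation_from_r_squared` and the `slope_sign` parameter are illustrative, and the positive sign for every correlation is an assumption (it holds only if each fitted line slopes upward).

```python
import math

# R-square values read off the three scatterplots (taste vs. each variable).
r_squared = {"Acetic": 0.302, "Lactic": 0.3055, "H2S": 0.5712}

def correlation_from_r_squared(r2, slope_sign=1):
    """Return r = sign(slope) * sqrt(R^2) for a simple linear regression."""
    return math.copysign(math.sqrt(r2), slope_sign)

# Assumption: all three fitted lines slope upward, so each r is positive.
correlations = {
    name: correlation_from_r_squared(r2) for name, r2 in r_squared.items()
}

# The strongest linear relationship is the one with the largest |r|.
strongest = max(correlations, key=lambda name: abs(correlations[name]))

for name, r in correlations.items():
    print(f"{name}: r = {r:.6f}")
print("Strongest correlation with taste:", strongest)
```

Comparing on |r| rather than r keeps the ranking correct even if one of the slopes were negative.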

(b)

To determine

To use software to obtain the regression equation, run inference for a regression model that includes all three explanatory variables, and interpret the software output, including the meaning of the value taken by $R^2$.

(b)

Expert Solution

Answer to Problem 28.14AYK

The equation is $\hat{y} = -32.86 + 2.001 x_1 + 4.567 x_2 + 13.671 x_3$.

Explanation of Solution

Running the regression of taste on all three explanatory variables in Excel gives the following output:

    Regression Statistics
    Multiple R           0.800438
    R Square             0.640701
    Adjusted R Square    0.599243
    Standard Error       10.29053
    Observations         30

    ANOVA
                  df    SS          MS          F           Significance F
    Regression    3     4909.619    1636.54     15.45438    5.68E-06
    Residual      26    2753.268    105.8949
    Total         29    7662.887

                  Coefficients    Standard Error    t Stat      P-value
    Intercept     -32.8566        20.2335           -1.62387    0.116466
    Acetic        2.000654        4.346475          0.460294    0.649132
    H2S           4.566348        1.176917          3.879925    0.000639
    Lactic        13.67117        6.643259          2.057902    0.049755
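
To help read the coefficient table, the sketch below recomputes each t statistic as coefficient divided by standard error and flags which slopes are significant. It uses only the numbers reported in the Excel output; the 0.05 significance level (`ALPHA`) is an assumption, since the solution does not state the level it uses, and the helper name `t_statistic` is illustrative.

```python
# Coefficient table for the three-variable model, as reported by Excel:
# name -> (coefficient, standard error, reported P-value)
coef_table = {
    "Acetic": (2.000654, 4.346475, 0.649132),
    "H2S": (4.566348, 1.176917, 0.000639),
    "Lactic": (13.67117, 6.643259, 0.049755),
}

ALPHA = 0.05  # assumed significance level; not stated in the solution

def t_statistic(coefficient, standard_error):
    """t = estimated slope / its standard error (tests slope = 0)."""
    return coefficient / standard_error

t_stats = {
    name: t_statistic(coef, se) for name, (coef, se, _p) in coef_table.items()
}

# A slope is flagged significant when its reported P-value is below ALPHA.
significant = {name: p < ALPHA for name, (_c, _se, p) in coef_table.items()}

for name in coef_table:
    print(f"{name}: t = {t_stats[name]:.4f}, significant = {significant[name]}")
```

Note that Lactic falls only just below the assumed 0.05 threshold, so its classification is sensitive to the choice of level.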

With $x_1$ = Acetic, $x_2$ = H2S and $x_3$ = Lactic, the equation is:

  $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$
  $\hat{y} = -32.86 + 2.001 x_1 + 4.567 x_2 + 13.671 x_3$

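$R^2 = 64.07\%$, so the three explanatory variables together explain about 64% of the variation in taste. The overall F test is significant (Significance F = 5.68E-06). Among the individual slopes, H2S and Lactic have P-values below 0.05, while Acetic (P-value 0.649132) does not.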
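
The meaning of R-square as "share of variation explained" can be checked directly from the ANOVA sums of squares. This is a minimal sketch using only the values in the Excel output for the three-variable model; the variable names are illustrative.

```python
import math

# ANOVA values for the three-variable model, as reported by Excel.
ss_regression, df_regression = 4909.619, 3
ss_residual, df_residual = 2753.268, 26

# Total variation in taste splits into explained + unexplained parts.
ss_total = ss_regression + ss_residual

# R^2 is the explained share of the total variation.
r_squared = ss_regression / ss_total

# Mean squares, overall F statistic, and the regression standard error.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)

print(f"SS total       = {ss_total:.3f}")
print(f"R^2            = {r_squared:.6f}")
print(f"F              = {f_statistic:.5f}")
print(f"Standard error = {standard_error:.5f}")
```

These reproduce the "R Square", "F" and "Standard Error" entries of the output up to rounding.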

(c)

To determine

To identify the explanatory variable that is not significant in the model in (b), create a new regression model that excludes this explanatory variable, interpret the software output, and compare it with the findings in (b).

(c)

Expert Solution

Answer to Problem 28.14AYK

The explanatory variable that is not significant is Acetic.

Explanation of Solution

In the output for part (b), the explanatory variable Acetic has a P-value (0.649132) greater than the level of significance, so it is not significant. We therefore remove this variable and rerun the regression in Excel with the other two variables. The result is:

    Regression Statistics
    Multiple R           0.798607
    R Square             0.637773
    Adjusted R Square    0.610941
    Standard Error       10.13922
    Observations         30

    ANOVA
                  df    SS          MS          F           Significance F
    Regression    2     4887.183    2443.592    23.76946    1.11E-06
    Residual      27    2775.704    102.8038
    Total         29    7662.887

                  Coefficients    Standard Error    t Stat      P-value
    Intercept     -24.4609        8.629104          -2.8347     0.008581
    H2S           4.858662        0.976305          4.976581    3.24E-05
    Lactic        14.28672        6.411593          2.228263    0.034385

In this model both explanatory variables are statistically significant, whereas in the model in (b) Acetic was not. The variation explained is approximately equal in the two models ($R^2$ of 0.6378 here versus 0.6407 in (b)).
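
The variable-selection step used here (find the explanatory variable with the largest P-value and ask whether it is significant) can be sketched as a small helper. The P-values are those reported in the Excel outputs for (b) and (c); the function name `least_significant` and the 0.05 level are assumptions made for illustration.

```python
ALPHA = 0.05  # assumed significance level; not stated in the solution

def least_significant(p_values):
    """Return (name, P-value) of the variable with the largest P-value."""
    name = max(p_values, key=p_values.get)
    return name, p_values[name]

# Slope P-values from the Excel outputs (intercepts are left out).
model_b = {"Acetic": 0.649132, "H2S": 0.000639, "Lactic": 0.049755}
model_c = {"H2S": 3.24e-05, "Lactic": 0.034385}

for label, p_values in (("(b)", model_b), ("(c)", model_c)):
    name, p = least_significant(p_values)
    verdict = "not significant" if p > ALPHA else "significant"
    print(f"Model {label}: largest P-value is {name} ({p}), {verdict}")
```

For model (b) the helper picks Acetic, which is far from significant; for model (c) it picks Lactic, which is still below the assumed 0.05 level.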

(d)

To determine

To explain which of the two remaining explanatory variables is less significant (has the larger P-value), create a new regression model that excludes this explanatory variable and keeps only the more significant one, and explain how this last model compares with the model in (c).

(d)

Expert Solution

Answer to Problem 28.14AYK

Of the two explanatory variables, Lactic is the less significant; it has the larger P-value.

Explanation of Solution

In the output for part (c), the P-value for Lactic (0.034385) is larger than that for hydrogen sulfide (3.24E-05). We therefore remove the Lactic variable and run the regression in Excel with hydrogen sulfide alone:

    Regression Statistics
    Multiple R           0.755752
    R Square             0.571162
    Adjusted R Square    0.555846
    Standard Error       10.83338
    Observations         30

    ANOVA
                  df    SS          MS          F           Significance F
    Regression    1     4376.746    4376.746    37.29265    1.37E-06
    Residual      28    3286.141    117.3622
    Total         29    7662.887

                  Coefficients    Standard Error    t Stat      P-value
    Intercept     -9.78684        5.95791           -1.64266    0.111638
    H2S           5.776089        0.94585           6.10677     1.37E-06

Compared with the model in part (c), the coefficient of determination (the variation explained) is smaller in this model, 0.5712 versus 0.6378, and in both models all the slopes are statistically significant.
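
With a single explanatory variable, the output has two built-in consistency checks: the overall F statistic equals the square of the slope's t statistic, and "Multiple R" squared equals "R Square". The sketch below verifies both from the reported numbers and computes the drop in R-square relative to the two-variable model; the variable names are illustrative.

```python
# Values reported by Excel for the H2S-only model.
t_h2s = 6.10677
f_reported = 37.29265
multiple_r = 0.755752
r_squared_one_variable = 0.571162

# R Square reported for the two-variable model (H2S and Lactic).
r_squared_two_variable = 0.637773

# In simple linear regression the overall F test and the slope t test
# are the same test, so F = t^2 (up to rounding in the output).
f_from_t = t_h2s ** 2

# Multiple R is the correlation between observed and fitted taste.
r_squared_from_r = multiple_r ** 2

# Share of variation no longer explained once Lactic is removed.
drop = r_squared_two_variable - r_squared_one_variable

print(f"t^2 = {f_from_t:.5f} (reported F = {f_reported})")
print(f"(Multiple R)^2 = {r_squared_from_r:.6f}")
print(f"Drop in R^2 from two variables to one: {drop:.6f}")
```

The equality of F and t squared also explains why "Significance F" and the H2S P-value are both 1.37E-06 in this output.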

(e)

To determine

To explain which model best explains cheddar taste, check the conditions for inference for this model, and conclude.

(e)

Expert Solution

Answer to Problem 28.14AYK

Model (b) best explains cheddar taste, and the conditions for inference are met.

Explanation of Solution

Comparing the models in (b), (c) and (d), the variation explained is greatest in (b): $R^2$ is 0.6407, versus 0.6378 in (c) and 0.5712 in (d). Thus, the model in (b) best explains cheddar taste. The conditions for inference are checked as follows: the scatterplot shows linearity; Excel's residual plot for the regression shows normality and constant variance; and the pieces of cheese were randomly selected, which gives independence. Thus, the conditions are met.
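
The comparison of the three models can be laid out side by side. The sketch below recomputes the "Adjusted R Square" entries from the reported R-square values with the standard formula 1 - (1 - R^2)(n - 1)/(n - k - 1), where n = 30 observations and k is the number of explanatory variables; the model labels and function name are illustrative. It is offered as a supplementary check, not as part of the original solution.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for a model with k explanatory variables, n cases."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

N = 30  # observations, from the Excel output

# label -> (reported R Square, number of explanatory variables)
models = {
    "(b) Acetic + H2S + Lactic": (0.640701, 3),
    "(c) H2S + Lactic": (0.637773, 2),
    "(d) H2S only": (0.571162, 1),
}

adjusted = {
    label: adjusted_r_squared(r2, N, k) for label, (r2, k) in models.items()
}

best_by_r_squared = max(models, key=lambda label: models[label][0])
best_by_adjusted = max(adjusted, key=adjusted.get)

for label, (r2, _k) in models.items():
    print(f"{label}: R^2 = {r2:.6f}, adjusted R^2 = {adjusted[label]:.6f}")
print("Largest R^2:", best_by_r_squared)
print("Largest adjusted R^2:", best_by_adjusted)
```

Raw R-square can never decrease when a variable is added, which is why it favors (b); adjusted R-square penalizes the extra variable and is slightly higher for (c), so the margin between the two models is small.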
