Database System Concepts
7th Edition
ISBN: 9780078022159
Author: Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education
Question
Please provide R code and an explanation for a one-way ANOVA on the penguins data set:
a. Load the penguins data set (from the palmerpenguins library)
b. Data Summaries & Assumption Check
i. Create a new data frame that only has the columns species and bill_depth_mm
from the original penguins data frame. Remove NA values from the new data
frame using na.omit().
ii. Create a single graph with 3 boxplots on the same scale, one for the bill depth
of each penguin species. Each boxplot should be a different color. From
this plot, do the group means appear to be the same?
iii. Create a new data frame for each species with the bill depth data for that
species. How many observations are there for each species?
iv. Check the normality assumption for each subset by creating histograms and Q-Q
plots. Make sure each plot has an appropriate title. Divide your plot region into 6 sections so you can see the histogram and Q-Q plot for each species side by side; par(mfrow=c(3,2)) will give you a 3-row, 2-column setup to work with.
v. What is the sample variance for each species? Do you think that the assumption of
common variance holds? Why? How could you test this?
c. Conduct a test using one-way ANOVA to test the null hypothesis that the mean bill depth
is the same for all 3 species.
i. Define your null and alternative hypotheses.
ii. Use the aov() function to conduct your test
iii. Use the summary() function to see the full details of the test.
iv. Report the degrees of freedom, sums of squares, p-value, and conclusion for your test.
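
The steps in parts a through c can be sketched in base R roughly as follows. This is a minimal sketch, not a definitive solution: it assumes the palmerpenguins package is already installed, the object names (bill, by_species, n_obs, group_var, fit, anova_table) and the boxplot colors are arbitrary choices for illustration, and Bartlett's test is only one possible answer to "How could you test this?" (Levene's test is a common alternative that is less sensitive to non-normality).

```r
# Sketch only. Assumes palmerpenguins is installed:
#   install.packages("palmerpenguins")
library(palmerpenguins)

# a. Take the data set from the package explicitly, so that a same-named
#    data set elsewhere on the search path cannot be picked up by mistake.
penguins <- palmerpenguins::penguins

# b.i  Keep only species and bill_depth_mm, then drop rows containing NA.
bill <- na.omit(as.data.frame(penguins[, c("species", "bill_depth_mm")]))

# b.ii One graph, three boxplots on a common scale, one color per species.
#      Note that the thick line in each box is the median, not the mean.
boxplot(bill_depth_mm ~ species, data = bill,
        col  = c("darkorange", "purple", "cyan4"),
        main = "Bill depth by species",
        xlab = "Species", ylab = "Bill depth (mm)")

# b.iii One data frame per species, and the number of observations in each.
by_species <- split(bill, bill$species)
n_obs <- sapply(by_species, nrow)
print(n_obs)

# b.iv Histogram and normal Q-Q plot side by side for each species
#      (3 rows, 2 columns). par() returns the previous settings.
old_par <- par(mfrow = c(3, 2))
for (sp in names(by_species)) {
  depth <- by_species[[sp]]$bill_depth_mm
  hist(depth, main = paste("Histogram of bill depth:", sp),
       xlab = "Bill depth (mm)")
  qqnorm(depth, main = paste("Normal Q-Q plot:", sp))
  qqline(depth)
}
par(old_par)  # restore the original plot layout

# b.v  Sample variance per species, plus one possible formal check of the
#      common-variance assumption (Bartlett's test; an assumption of this
#      sketch, the assignment does not name a test).
group_var <- tapply(bill$bill_depth_mm, bill$species, var)
print(group_var)
print(bartlett.test(bill_depth_mm ~ species, data = bill))

# c.   One-way ANOVA.
#      H0: mean bill depth is equal for Adelie, Chinstrap, and Gentoo.
#      Ha: at least one species mean differs from the others.
fit <- aov(bill_depth_mm ~ species, data = bill)
anova_table <- summary(fit)
print(anova_table)
```

In the printed summary, the Df column gives the degrees of freedom, the Sum Sq column gives the between-species and residual sums of squares, and Pr(>F) is the p-value; the null hypothesis is rejected when that p-value falls below the chosen significance level. If the variances look clearly unequal, oneway.test(bill_depth_mm ~ species, data = bill) runs Welch's version of the test, which does not assume a common variance.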