Stat311 Homework 7

11/30/22, 9:41 PM  Stat311 Homework 7  Page 1 of 16  file:///Users/tinasong/Downloads/Homework7Template.html

Tina Song
2022-11-30

Read in the ice cream, birthweight, and cholesterol data sets.

    # Packages used by the code in the later problems
    # (%>%, filter, summarize, rep_sample_n, ggplot)
    library(dplyr)
    library(infer)
    library(ggplot2)

    # Ice cream data: Sex and Flavor are categorical
    IC.df <- read.csv("IceCream.csv", header = TRUE, as.is = TRUE)
    IC.df$Sex <- as.factor(IC.df$Sex)
    IC.df$Flavor <- as.factor(IC.df$Flavor)
    # Birthweight data
    BW.df <- read.csv("BirthWeight.csv", header = TRUE, as.is = TRUE)
    BW.df$Smoker <- as.factor(BW.df$Smoker)
    BW.df$BirthWt <- as.factor(BW.df$BirthWt)
    # Cholesterol data
    C.df <- read.csv("Cholesterol.csv", header = TRUE, as.is = TRUE)
    C.df$Cereal <- as.factor(C.df$Cereal)

Problem 1

Part 1a) “more than”
The given statement is a statement about the alternative hypothesis.
H0: p = 0.25 vs. Ha: p > 0.25

Part 1b) “most”
The given statement is a statement about the alternative hypothesis.
H0: p = 0.5 vs. Ha: p > 0.5

Part 1c) “equal to”
The given statement is a statement about the null hypothesis.
H0: mu = 121 vs. Ha: mu ≠ 121

Part 1d) “no more than”
The given statement is a statement about the null hypothesis.
H0: p ≤ 0.02 vs. Ha: p > 0.02

Part 1e) “at least”
The given statement is a statement about the null hypothesis.
H0: mu ≥ 0.8535 vs. Ha: mu < 0.8535

Part 1f) “better than”
The given statement is a statement about the alternative hypothesis.
p1: the success rate with surgery
p2: the success rate with splinting
H0: p1 = p2 vs. Ha: p1 > p2

Part 1g) “greater”
The given statement is a statement about the alternative hypothesis.
mu1: the mean age of unsuccessful job applicants
mu2: the mean age of successful job applicants
H0: mu1 = mu2 vs. Ha: mu1 > mu2

Problem 2

mu0: the students’ observed sample mean puzzle score = 52.405
“different than” (alternative hypothesis)
Define the statistical hypotheses as:
H0: mu = 52.405 vs. Ha: mu ≠ 52.405
Use a 5% significance level: alpha = 0.05
The t test statistic: -0.7927453
The critical value: 1.971957
Since this is a two-tailed test, t-crit is ±1.97.
By-hand p-value: 0.4288703
R p-value: 0.4289
Since the p-value (= 0.43) is > 0.05, and equivalently |t| = 0.79 is below the critical value 1.97, we fail to reject the null hypothesis. There is insufficient evidence that the students’ population mean video score is different than the observed sample mean puzzle score (p = 0.43).

    # Sample mean and SD of the puzzle scores
    IC.df %>%
      summarize(mean.Puzzle = mean(Puzzle, na.rm = TRUE),
                SD.Puzzle = sd(Puzzle, na.rm = TRUE))

    ##   mean.Puzzle SD.Puzzle
    ## 1      52.405  10.73579

    # Sample mean and SD of the video scores
    IC.df %>%
      summarize(mean.Video = mean(Video, na.rm = TRUE),
                SD.Video = sd(Video, na.rm = TRUE))

    ##   mean.Video SD.Video
    ## 1      51.85 9.900891

    # t statistic: (sample mean - mu0) / (SD / sqrt(n)), with n = 200
    (t <- (51.85 - 52.405) / (9.900891 / sqrt(200)))

    ## [1] -0.7927453

    # Degrees of freedom: n - 1
    (df <- 200 - 1)

    ## [1] 199

    # Two-tailed critical value at alpha = 0.05
    (tcrit <- qt(0.975, df))

    ## [1] 1.971957

    # Two-sided p-value: twice the lower-tail area below the negative t statistic
    (pvalue <- 2 * pt(-0.7927453, df))

    ## [1] 0.4288703

    # t.test() has no alpha argument; the level is set through conf.level = 1 - alpha
    t.test(IC.df$Video, mu = 52.405, conf.level = 0.95, alternative = "two.sided")

    ##
    ##  One Sample t-test
    ##
    ## data:  IC.df$Video
    ## t = -0.79275, df = 199, p-value = 0.4289
    ## alternative hypothesis: true mean is not equal to 52.405
    ## 95 percent confidence interval:
    ##  50.46944 53.23056
    ## sample estimates:
    ## mean of x
    ##     51.85

Problem 3

Part 3a)
mu1: the population mean puzzle score for students who prefer vanilla ice cream
mu2: the population mean puzzle score for students who prefer chocolate ice cream
Hypotheses: H0: mu1 = mu2 vs. Ha: mu1 ≠ mu2
alpha = 0.05
p-value = 0.014
Since p < 0.05, we reject the null hypothesis. There is sufficient evidence that students who prefer vanilla ice cream have a population mean puzzle score that is different than the population mean score for students who prefer chocolate ice cream (p = 0.014).

    IC.V <- filter(IC.df, Flavor == "1")  # vanilla
    IC.C <- filter(IC.df, Flavor == "2")  # chocolate
    t.test(IC.V$Puzzle, IC.C$Puzzle, alternative = "two.sided")
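
    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  IC.V$Puzzle and IC.C$Puzzle
    ## t = 2.5026, df = 85.294, p-value = 0.01423
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  0.9687439 8.4561161
    ## sample estimates:
    ## mean of x mean of y
    ##  52.03158  47.31915

Part 3b)
Assuming the same significance level: alpha = 0.05
Reported p-value: 1.987
A p-value is a proportion and cannot exceed 1, so 1.987 is not a valid permutation p-value. Because diff_orig is negative (-4.71), each of the two conditions in the count (diff_orig <= diff_perm and diff_perm <= -diff_orig) holds for almost every replicate, so the count of 1987 is close to 2 × 1000. Every replicate satisfies at least one condition, and only the replicates between -4.71 and 4.71 satisfy both, so 1987 - 1000 = 987 replicates are less extreme than the observed difference and 1000 - 987 = 13 are at least as extreme. That gives a two-sided permutation p-value of about 13/1000 = 0.013 (assuming no permuted difference exactly equals ±4.71).
The corrected permutation p-value is close to the p-value (= 0.014) in part (a). Since 0.013 < 0.05, I make the same conclusion as in (a): reject the null hypothesis.

    set.seed(15)
    # Keep only the vanilla (1) and chocolate (2) students
    IC.VC <- filter(IC.df, Flavor == "1" | Flavor == "2")
    # 1000 replicates; within each, shuffle Puzzle and take the group mean difference
    PermsOut <- IC.VC %>%
      rep_sample_n(size = nrow(IC.VC), reps = 1000, replace = FALSE) %>%
      mutate(IC.VC_perm = sample(Puzzle)) %>%
      group_by(replicate, Flavor) %>%
      summarize(prop_IC.df_perm = mean(IC.VC_perm),
                mean_IC.df = mean(Puzzle)) %>%
      summarize(diff_perm = diff(prop_IC.df_perm),
                diff_orig = diff(mean_IC.df))

    ## `summarise()` has grouped output by 'replicate'. You can override using the
    ## `.groups` argument.

    PermsOut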

    ## # A tibble: 1,000 × 3
    ##    replicate diff_perm diff_orig
    ##        <int>     <dbl>     <dbl>
    ##  1         1    2.95       -4.71
    ##  2         2   -0.896      -4.71
    ##  3         3    0.440      -4.71
    ##  4         4   -5.06       -4.71
    ##  5         5    0.0580     -4.71
    ##  6         6   -2.39       -4.71
    ##  7         7    0.662      -4.71
    ##  8         8    1.52       -4.71
    ##  9         9    1.30       -4.71
    ## 10        10   -5.51       -4.71
    ## # … with 990 more rows

    # Note: diff_orig is negative here, so both conditions below are TRUE for
    # nearly every replicate and the count can approach 2 * 1000.
    (countout <- PermsOut %>%
      summarize(count = sum(diff_orig <= diff_perm) +
                        sum(diff_perm <= -diff_orig)))

    ## # A tibble: 1 × 1
    ##   count
    ##   <int>
    ## 1  1987

    (pvalue1 <- 1987 / 1000)

    ## [1] 1.987

    # Histogram of the permuted differences with the observed difference
    # marked in red on both sides of zero
    origdiff <- PermsOut$diff_orig[1]
    p1 <- ggplot(data = PermsOut, aes(x = diff_perm)) +
      geom_histogram(bins = 13) +
      xlab("Permuted difference in mean puzzle score") +
      geom_vline(xintercept = origdiff, col = "Red") +
      geom_vline(xintercept = abs(origdiff), col = "Red")
    p1
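
A permutation p-value is a proportion, so it has to fall between 0 and 1. The sketch below shows one common way to compute a two-sided version: compare absolute values, so the sign of the observed difference does not matter. The helper name perm_pvalue and the variables first10 and obs are assumptions introduced here for illustration; the ten values are the rounded diff_perm entries printed in the PermsOut output, not the full simulation.

```r
# perm_pvalue() is a hypothetical helper for illustration; it is not part of
# the original homework code.
# Two-sided permutation p-value: share of permuted differences that are at
# least as far from zero as the observed difference.
perm_pvalue <- function(diff_perm, diff_orig) {
  mean(abs(diff_perm) >= abs(diff_orig))
}

# First ten permuted differences and the observed difference, as printed
# (rounded) in the PermsOut output.
first10 <- c(2.95, -0.896, 0.440, -5.06, 0.0580, -2.39, 0.662, 1.52, 1.30, -5.51)
obs <- -4.71

# Count as written in the homework: both conditions are TRUE whenever
# -4.71 <= diff_perm <= 4.71, so most replicates are counted twice.
naive_count <- sum(obs <= first10) + sum(first10 <= -obs)

# Count of replicates that are at least as extreme as the observed difference.
extreme_count <- sum(abs(first10) >= abs(obs))

naive_count / length(first10)   # 1.8, not a valid proportion
perm_pvalue(first10, obs)       # 0.2
```

On the full simulation the same helper could be called as perm_pvalue(PermsOut$diff_perm, PermsOut$diff_orig[1]). That result is not reproduced here because it depends on the data files.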