Lab 8 Pre-Lab and Exercises
11/26/2019
Statistics 213 Lab Exercises – Confidence Intervals and a Bootstrapped Sample
https://scott-robison.rstudio.cloud/a8c401512bf9446cb9e30f29e3a95885/file_show?path=%2Fcloud%2Fproject%2FLab8.nb.html
Statistics 213 Lab Exercises – Confidence Intervals and a Bootstrapped Sample
© Jim Stallard, Scott Robison, and Claudia Mahler 2019 all rights reserved.
Pre-Lab Exercise 1:
To determine the amount of caffeine (in milligrams) in a medium ‘light-roast’ cup of coffee from Good Dirt
Café, a random sample of 12 medium cups of light-roast blend was inspected over the course of a week. The
amount of caffeine in each cup was observed. The resulting data are provided.
x=c(112.8, 86.4, 45.9, 110.3, 100.3, 93.3, 101.9, 115.7, 92.5, 117.3, 105.6, 81.6)
x
The mean and standard deviation of this sample were computed:
mean(x)
sd(x)
n=length(x)
n
a. Compute a 95% confidence interval for μ, the mean amount of caffeine in a
medium-sized cup of light-roast blend from Good Dirt Café, assuming the
data are normally distributed.
mean(x)-qt(0.975,n-1)*sd(x)/n^.5
mean(x)+qt(0.975,n-1)*sd(x)/n^.5
Or by using tables:
96.96667-2.201*19.75262/12^.5
96.96667+2.201*19.75262/12^.5
Notice that the table gives less precision due to rounding!
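The computer-based calculation above can be wrapped in a small helper so the confidence level is easy to change. This is a minimal sketch in base R: the function name t_interval is our own (hypothetical), and the built-in t.test() is used only as a cross-check of the hand formula.

```r
# Caffeine data from Pre-Lab Exercise 1
x <- c(112.8, 86.4, 45.9, 110.3, 100.3, 93.3, 101.9, 115.7, 92.5, 117.3, 105.6, 81.6)

# t_interval is a hypothetical helper name, not part of any package
t_interval <- function(x, conf = 0.95) {
  n <- length(x)
  t_mult <- qt(1 - (1 - conf) / 2, df = n - 1)  # t multiplier with n - 1 degrees of freedom
  margin <- t_mult * sd(x) / sqrt(n)            # multiplier times the standard error
  c(lower = mean(x) - margin, upper = mean(x) + margin)
}

ci <- t_interval(x)
ci

# Cross-check against the interval from the built-in one-sample t procedure
t.test(x, conf.level = 0.95)$conf.int
```

Both calls should agree, giving roughly (84.42, 109.52) for these data.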
Or by bootstrapping the sample:
library(mosaic)
RNGkind(sample.kind="Rejection"); set.seed(1); # this makes it so that you will get the same random sample as we get below
B=do(1000) * mean(resample(x, n));
quantile(B$mean,0.025)
quantile(B$mean,0.975)
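For readers without the mosaic package, the same percentile bootstrap idea can be sketched in base R. This is an illustration under our own assumptions (the variable names boot_means and boot_ci are hypothetical), and because the random draws are generated differently it need not reproduce the exact numbers from the mosaic code above.

```r
# Caffeine data from Pre-Lab Exercise 1
x <- c(112.8, 86.4, 45.9, 110.3, 100.3, 93.3, 101.9, 115.7, 92.5, 117.3, 105.6, 81.6)

set.seed(1)  # fixes the random draws so the run is repeatable

# Each replicate: draw length(x) values from x with replacement and record their mean
boot_means <- replicate(1000, mean(sample(x, size = length(x), replace = TRUE)))

# Percentile interval: the middle 95% of the resampled means
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))
boot_ci
```

The 0.025 and 0.975 quantiles cut off 2.5% of the resampled means in each tail, which is what makes this a 95% interval.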
We now have produced three different confidence intervals for μ:
1. using the assumption that the data is normally distributed (yet we don’t know σ)
and then using the computer to give more digits for the T distribution than tables
would give:
95% confidence interval:
2. using the assumption that the data is normally distributed (yet we don’t know σ)
and then using the T tables would give:
95% confidence interval:
3. using no assumptions, allowing the computer to resample (with replacement from
the sample we have, repeatedly), “Bootstrapping”:
95% confidence interval:
So which interval is best?
Typically the fewer assumptions we make the better, so unless we know the sample comes from a normally
distributed population, it is likely that the bootstrapped confidence interval is the best. However, in order to
bootstrap, you must have a computer and the raw data from the sample. Therefore, when possible, bootstrapping
is a great idea, but the other methods are good too.
In this scenario it was possible to bootstrap, and we did not know the normality of the data, so let’s use the
bootstrapped confidence interval as the best option:
95% confidence interval: .
b. Now look at the lower and upper bounds of the confidence interval found in (a)
and consider the following statement: “The probability that μ falls between the
lower and upper bounds is about 0.95.” Is this statement true or false? Why do
you think this is true or false?
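False! μ is a fixed value, so it is either in the interval or it is not. The confidence interval is not a measure of
probability but a measure of certainty! The 95% describes the method: about 95% of the intervals built this way
from repeated random samples would contain μ.
We are 95% confident, or sure, that μ will fall between the bounds found in (a).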
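The idea that the confidence level describes the procedure rather than any one interval can be checked with a small simulation. This is a sketch under assumed values: the population mean of 100 and standard deviation of 20 are hypothetical choices made only so that the true μ is known, and they are not taken from the lab data.

```r
set.seed(1)

# Hypothetical population values chosen for illustration only
mu <- 100
sigma <- 20
n <- 12

# For each simulated sample, build the 95% t interval and record whether it contains mu
covers <- replicate(5000, {
  s <- rnorm(n, mean = mu, sd = sigma)
  margin <- qt(0.975, df = n - 1) * sd(s) / sqrt(n)
  (mean(s) - margin <= mu) && (mu <= mean(s) + margin)
})

# Proportion of intervals that captured mu
coverage <- mean(covers)
coverage
```

Each individual interval either contains μ or does not, and the proportion that do should come out close to 0.95.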
c. What do you think would happen to the 95% confidence interval you found in (a)
if the sample size were larger than 12 medium cups of light-roast blend? Or, if
the sample size were the same and the level of confidence was smaller, like
90%?
When the sample size (n) goes up, the interval width goes down, since our accuracy of estimation increases
with the number of data points.
When the confidence level goes up, the interval width increases, since to be more confident we have to
include more possibilities. A 90% interval from the same 12 cups would therefore be narrower than the 95% interval.
Pre-Lab Exercise 2:
Is the probability of getting heads for a particular coin really 0.50? You decide to flip a particular coin 100 times,
each time observing the upper-side of the coin being ‘heads’ or ‘tails’. If the upper-side shows ‘heads’, you quantify
this with a ‘1’. Otherwise, you quantify the outcome of the coin-flip with a ‘0’. After the 100 tosses, you observe 61
heads.
Find a 95% confidence interval for p, the probability that this coin will show ‘heads’. From your answer, can you
say that the probability of this coin showing ‘heads’ is 0.50? Why or why not?
61/100-qnorm(0.975)*((61/100)*(1-61/100)/100)^.5
61/100+qnorm(0.975)*((61/100)*(1-61/100)/100)^.5
Or by using tables:
61/100-1.96*((61/100)*(1-61/100)/100)^.5
61/100+1.96*((61/100)*(1-61/100)/100)^.5
Notice that the table gives less precision due to rounding!
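The Z-based interval for a proportion can also be written as a reusable function. This is a minimal sketch in base R; the name prop_interval is our own (hypothetical), and it implements the same p̂ ± z × √(p̂(1 − p̂)/n) formula used in the calculations above.

```r
# prop_interval is a hypothetical helper name, not part of any package
prop_interval <- function(successes, n, conf = 0.95) {
  p_hat <- successes / n
  z <- qnorm(1 - (1 - conf) / 2)               # z multiplier, about 1.96 for 95%
  margin <- z * sqrt(p_hat * (1 - p_hat) / n)  # multiplier times the standard error
  c(lower = p_hat - margin, upper = p_hat + margin)
}

# 61 heads in 100 flips
ci_p <- prop_interval(61, 100)
ci_p

# Is the "fair coin" value 0.50 inside the interval?
fair_inside <- ci_p[["lower"]] <= 0.50 && 0.50 <= ci_p[["upper"]]
fair_inside
```

For 61 heads in 100 flips the interval is roughly (0.514, 0.706), so the check returns FALSE.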
Or by bootstrapping the sample:
library(mosaic)
RNGkind(sample.kind="Rejection"); set.seed(1); # this makes it so that you will get the same random sample as we get below
B=do(1000) * mean(resample(c(rep(1,61),rep(0,100-61)), 100));
quantile(B$mean,0.025)
quantile(B$mean,0.975)
We now have produced three different confidence intervals for p:
1. using the assumption that the data is binomially distributed and then using the
computer to give more digits for the Z distribution than tables would give:
95% confidence interval:
2. using the assumption that the data is binomially distributed and then using the Z
tables would give:
95% confidence interval:
3. using no assumptions, allowing the computer to resample (with replacement from
the sample we have, repeatedly), “Bootstrapping”:
95% confidence interval:
So which interval is best? Typically the fewer assumptions we make the better, so unless we know the normal
approximation behind the Z interval is reasonable, it is likely that the bootstrapped confidence interval is the
best. However, to bootstrap you must have a computer. Therefore, when possible, bootstrapping is a great idea,
but the other methods are good too.
Since p = 0.50 does not fall between the bounds of the 95% confidence interval, it does not
appear that this coin is “fair” (flips heads or tails 50% of the time).
Lab Exercise 1:
A random sample of 16 flights offered by a certain national air carrier is taken. For each flight chosen, the number
of minutes the flight was delayed was observed. The flight delay is defined as the difference between the time the
plane was scheduled to pull away from the jet way and the actual time the plane pulls away from the jet way (with
positive values indicating that the flight is late). For now, assume the flight-delay variable is normally distributed.
Data:
a. Analyze the data by copying and pasting it into R-Studio.
x=c(0, 1, -6, -6, 157, -3, 178, -3, -10, 42, -2, 120, 5, 59, 0, -2)
Compute the value of the sample mean and the sample standard deviation.
b. Using R Studio, find the t-multiplier needed for the t-version of the confidence
interval for the population mean μ.
Calculate the 95% confidence interval for μ using this t-multiplier.
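One way to approach this step, following the pattern of Pre-Lab Exercise 1, is sketched below. The variable names t_mult, lower, and upper are our own (hypothetical); the multiplier comes from qt() with n − 1 = 15 degrees of freedom.

```r
# Flight-delay data (minutes) from Lab Exercise 1
x <- c(0, 1, -6, -6, 157, -3, 178, -3, -10, 42, -2, 120, 5, 59, 0, -2)
n <- length(x)

# 95% confidence leaves 2.5% in each tail, so use the 0.975 quantile
t_mult <- qt(0.975, df = n - 1)
t_mult

# Interval: sample mean plus or minus multiplier times standard error
lower <- mean(x) - t_mult * sd(x) / sqrt(n)
upper <- mean(x) + t_mult * sd(x) / sqrt(n)
c(lower = lower, upper = upper)
```

With 15 degrees of freedom the multiplier is about 2.131, slightly larger than the 1.96 used for Z intervals.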
c. Now consider a 99% bootstrapped confidence interval for μ. Before doing any
computation, do you expect this confidence interval to be wider, narrower, or to
have the same width as the 95% confidence interval you computed in part b?
d. Copy and paste the following code into R Studio to bootstrap the sample:
library(mosaic)
RNGkind(sample.kind="Rejection"); set.seed(1); # so you will get the "same" random sampling as me
B=do(1000) * mean(resample(c(0, 1, -6, -6, 157, -3, 178, -3, -10, 42, -2, 120, 5, 59, 0, -2), 16));
Compute a 99% bootstrap confidence interval for μ by copying the following two
pieces of code into R Studio.
quantile(B$mean,0.005);
quantile(B$mean,0.995);
Lab Exercise 2:
The following is data that resulted from a cluster sample of 109 students taking Statistics 213. Each student was
asked if they support differential tuition fees. If a student did support differential fees, their response was coded
with a “1”. A “non support” was coded with a “0”. The data is as follows:
0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1,
1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1,
1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1
a. Use R Studio to find the number of Statistics 213 students in this sample who
“support” differential tuition fees, which you will record as the observed value of
the random variable X, a binomially distributed random variable. Rather than
simply counting the number of ’1’s, follow the R Studio steps:
x=c(0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1);
sum(x);
length(x);
b. Compute the 97% confidence interval for p, the proportion of all Statistics 213
students who support differential tuition fees (when data follows a binomial
distribution).
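A 97% level is not on most Z tables, so the multiplier is easiest to get from qnorm(). The sketch below is one possible approach, using 55 supporters out of 109 students (the counts that sum(x) and length(x) give for the data above); the variable names are our own.

```r
successes <- 55  # number of 1s in the data, from sum(x)
n <- 109         # sample size, from length(x)

conf <- 0.97
z <- qnorm(1 - (1 - conf) / 2)  # 1.5% in each tail, so the 0.985 quantile
z

p_hat <- successes / n
margin <- z * sqrt(p_hat * (1 - p_hat) / n)

ci_97 <- c(lower = p_hat - margin, upper = p_hat + margin)
ci_97
```

The multiplier is about 2.17, and the resulting interval is roughly (0.40, 0.61).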
c. Use R Studio to compute a 97% bootstrapped confidence interval.
library(mosaic)
RNGkind(sample.kind="Rejection"); set.seed(1); # so you will get the "same" random sampling as me
B=do(1000) * mean(resample(c(rep(1,55),rep(0,109-55)), 109));
quantile(B$mean,0.015);
quantile(B$mean,0.985);
d. A figure recently quoted by an executive of the Student’s Union (SU) was that
30% of all U of C students support differential tuition fees. From your finding in
parts (b) and (c), is this figure supported?
Use the skills you have learned in this lab to complete the lab quiz.