School: Michigan State University. Course: 201. Subject: Statistics.


Lab9_Instruction
Shuting Sun
2024-01-30

Today's lab will explore the sampling distribution of the sample proportion $\hat{p}$ and construct normal theory confidence intervals (CIs) for the population proportion $p$. This material is Sections 9.4 and 10.2 of the text. Note that lines beginning with '#' are comments, meant to add details about the code.

[A - B] We should find in the case ($n = 100$, $p = .30$) that the sampling distribution is approximately normal and, by experimentation, that the normal theory confidence interval $\hat{p} \pm 1.96\sqrt{\hat{p}(1 - \hat{p})/100}$ delivers nearly the advertised 95% coverage probability.

[C - D] We should find in another case ($n = 100$, $p = .04$) that the sample proportion $\hat{p}$ does not follow a bell-shaped curve. By experimentation, we should find that the normal theory confidence interval $\hat{p} \pm 1.96\sqrt{\hat{p}(1 - \hat{p})/100}$ does not deliver the advertised 95% coverage probability in this case.

A. You are given a Bernoulli population (a population of successes and failures) with $p = .30$. You are to determine the sampling distribution of the sample proportion $\hat{p}$ based on a random sample of size $n = 100$. Now create a file for this lab and use the following code. (Go back to the instructions for lab 1 if you don't know how to create a file.)

    # x <- seq.int(from=0, to=n, by=1): creates the sequence of integers from 0 to n
    #   (100 in this case), incrementing by 1.
    # P_phat <- dbinom(x, size=n, prob=p): calculates the probability of observing each
    #   number of successes in x under a binomial distribution with n trials and
    #   success probability p.
    # Cum_Prob <- pbinom(x, size=n, prob=p): calculates the cumulative binomial
    #   probability. For each value in x, pbinom gives the probability of getting that
    #   many or fewer successes. (Type "?pbinom" in the console for details.)
    n <- 100
    p <- .3
    x <- seq.int(from = 0, to = n, by = 1)
    phat <- x / n
    P_phat <- dbinom(x, size = n, prob = p)
    Cum_Prob <- pbinom(x, size = n, prob = p)

Use RStudio to plot P_phat (y-axis) vs. phat (x-axis) and examine the plot.

    #Q1
    plot(phat, P_phat, type = "l", col = "blue", xlab = "phat", ylab = "P_phat",
         main = "Scatterplot of P_phat vs phat")
    points(phat, P_phat, pch = 20, col = "red")
    #Q2
    p
    #Q3
    (p * (1 - p) / n) ** .5
    # ** .5 raises a value to the power 0.5, i.e. takes its square root
    # (equivalent to sqrt()).

1. Is the pattern of probabilities approximately bell-shaped? ________________
2. Determine the mean of the sampling distribution of $\hat{p}$: $p$ = ______________
3. Calculate the standard deviation of the sampling distribution of $\hat{p}$: $\sqrt{p(1 - p)/100}$ = _________ Keep at least 3 decimals in your answer.

    # pnorm calculates the cumulative distribution function (CDF) of a normal
    # distribution. (Type "?pnorm" in the console for details.)
    #Q4
    pnorm(.34, mean = p, sd = (p * (1 - p) / n) ** .5)
    #Q5
    pnorm(.26, mean = p, sd = (p * (1 - p) / n) ** .5)
    #Q6
    approx <- pnorm(.34, mean = p, sd = (p * (1 - p) / n) ** .5) -
      pnorm(.26, mean = p, sd = (p * (1 - p) / n) ** .5)
    approx
    #Q7
    # P(.26 <= phat <= .34) = P(26 <= X <= 34) = P(X <= 34) - P(X <= 25)
    exact <- pbinom(34, size = n, prob = p) - pbinom(25, size = n, prob = p)
    exact

4. Record the cumulative probability for .34 here _________. Keep 4 decimals.
5. Repeat with .26 as the input constant and record the cumulative probability ____________. Keep 4 decimals.
6. Subtract the cumulative probabilities ____________________. Keep 4 decimals. This gives the normal approximation to $P(.26 \le \hat{p} \le .34)$.
7. Exact $P(.26 \le \hat{p} \le .34)$ = _____________ Keep 4 decimals.

Summarize the results from questions 6 and 7:
Normal Approximation ______ Exact Binomial ____

Find the error of approximation as the difference between the exact value and the approximate value (exact minus approximate, as in question 8):

    #Q8
    error <- exact - approx
    error

    #Q9
    error / exact * 100

8. Error = exact value minus approximate value = _____________. Keep 4 decimals.
9. Find the relative error as [(error)/(exact value)]*100% = ___________. Enter the numerical value into LON-CAPA without the % symbol.

B. We are now about to conduct an experiment where 1,000 random samples of size $n = 100$ are taken from a Bernoulli population with $p = .30$. For each sample we will compute the sample proportion $\hat{p}$ and the confidence limits for the normal theory 95% confidence interval estimate of $p$. We will then count how many of the intervals covered the population parameter $p = .30$.

Copy the values xgen1, pgen1, LCL1 and UCL1 from your first 10 CIs into columns 2 - 5 of Table 1 below and indicate in column 6 whether the CI covers the population rate $p = .30$, i.e. whether 0.30 is between LCL1 and UCL1. Use two decimal places for pgen1 and three decimal places for LCL1 and UCL1.

    #Q10-14
    simulation <- 1000
    xgen1 <- rbinom(simulation, size = n, prob = p)
    pgen1 <- xgen1 / n
    LCL1 <- pgen1 - 1.96 * (pgen1 * (1 - pgen1) / n) ** .5
    UCL1 <- pgen1 + 1.96 * (pgen1 * (1 - pgen1) / n) ** .5
    cover_pt_3 <- (p > LCL1) & (p < UCL1)  # Cover .30? (TRUE = yes, FALSE = no)
    Table_1 <- data.frame(
      Sample = seq(1, 10),
      xgen1 = xgen1[1:10],
      pgen1 = pgen1[1:10],
      LCL1 = LCL1[1:10],
      UCL1 = UCL1[1:10],
      Cover_pt_3 = cover_pt_3[1:10],
      row.names = NULL
    )
    Table_1
    # Read the table and answer questions 10-14

For the first row of Table 1, enter the values into LON-CAPA:
10. xgen1 = ________
11. pgen1 = _________
12. LCL1 = ________
13. UCL1 = ________
14. Out of the first 10 intervals recorded in Table 1, how many cover .30? ______________ (Hint: count the number of TRUE values.)

Now we will check all 1,000 samples to see in how many cases the CI covers $p = .30$. Express the result as a percentage out of 1,000. For example, 945 is 94.5%.

    #Q15
    sum(cover_pt_3) / simulation * 100

15. Coverage = _______________ For LON-CAPA submission, enter a number between 0 and 100 without the % symbol.
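
The steps of part B can also be collected into a single function so the same experiment can be rerun for other values of n and p. The sketch below is not part of the lab handout: the function name `coverage_pct`, its argument names, and the seed value are assumptions made for illustration, and because the draws are random your numbers will differ from anyone else's unless the same seed is used.

```r
# Hypothetical helper (not from the handout): repeats the part B experiment
# for a given n and p and returns the coverage as a percentage.
coverage_pct <- function(n, p, simulation = 1000, z = 1.96) {
  xgen <- rbinom(simulation, size = n, prob = p)  # successes in each sample
  pgen <- xgen / n                                # sample proportions
  se   <- sqrt(pgen * (1 - pgen) / n)             # estimated standard error
  LCL  <- pgen - z * se                           # lower confidence limits
  UCL  <- pgen + z * se                           # upper confidence limits
  mean(p > LCL & p < UCL) * 100                   # percent of CIs covering p
}

set.seed(201)  # arbitrary seed, assumed here only to make reruns repeatable
coverage_pct(n = 100, p = 0.30)  # the case studied in parts A - B
coverage_pct(n = 100, p = 0.04)  # the case described for parts C - D
```

Running both calls side by side gives a quick check of the claim in the introduction that the interval comes close to 95% coverage when p = .30 but falls short when p = .04.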
