Stat 371 Homework #2
Due Wednesday Feb 9th 11:59 pm

*Submit your homework to Canvas by the due date and time. Email your lecturer if you have extenuating circumstances and need to request an extension.
*If an exercise asks you to use R, include a copy of the code and output. Please edit your code and output to be only the relevant portions.
*If a problem does not specify how to compute the answer, you may use any appropriate method. I may ask you to use R or manual calculations on your exams, so practice accordingly.
*You must include an explanation and/or intermediate calculations for an exercise to be complete.
*Be sure to submit the HWK2 Auto grade Quiz which will give you ~20 of your 40 accuracy points.
*50 points total: 40 points accuracy, and 10 points completion

Summarizing Data Numerically and Graphically and Probability

Exercise 1: A certain reaction was run several times using each of two catalysts, A and B. The catalysts are supposed to control the yield of an undesirable side product. Results, in units of percentage yield, for 25 runs of catalyst A and 23 runs of catalyst B are given below and also in Catalysts.csv.

Catalyst A: 4.3, 4.4, 3.4, 2.6, 3.8, 4.9, 4.6, 5.2, 4.7, 4.1, 2.6, 6.7, 4.1, 3.6, 2.9, 2.6, 4.0, 4.3, 3.9, 4.8, 4.5, 4.4, 3.1, 5.7, 4.5
Catalyst B: 3.4, 5.9, 1.2, 2.1, 5.5, 6.4, 5.0, 5.8, 2.5, 3.7, 3.8, 5.1, 3.1, 1.6, 3.5, 5.9, 6.7, 5.2, 5.8, 2.2, 4.3, 3.8, 1.2

a. Use R to create a histogram for the percentage yield of the undesirable side product for the two catalysts (any kind of histogram that you want since sample sizes are similar). Have identical x and y axis scales so the two groups’ values are more easily compared. Include useful titles.
# Enter the data by hand (values from the exercise statement)
CatA <- c(4.3, 4.4, 3.4, 2.6, 3.8, 4.9, 4.6, 5.2, 4.7, 4.1, 2.6, 6.7, 4.1, 3.6, 2.9, 2.6, 4.0, 4.3, 3.9, 4.8, 4.5, 4.4, 3.1, 5.7, 4.5)
CatB <- c(3.4, 5.9, 1.2, 2.1, 5.5, 6.4, 5.0, 5.8, 2.5, 3.7, 3.8, 5.1, 3.1, 1.6, 3.5, 5.9, 6.7, 5.2, 5.8, 2.2, 4.3, 3.8, 1.2)
length(CatA); length(CatB)
## [1] 25
## [1] 23
# or, to get the data from the CSV file:
Catalysts = read.csv("Catalysts.csv", header = TRUE)
CatA_csv = subset(Catalysts, Catalyst == "A")
CatA_PercY = CatA_csv$PercYield
CatB_csv = subset(Catalysts, Catalyst == "B")
CatB_PercY = CatB_csv$PercYield
# Put the next two plots side by side
par(mfrow = c(1, 2))
# Same breaks and ylim for both plots so the axes are directly comparable
hist(CatA, breaks = seq(0, 7, 1), main = "Catalyst A", xlab = "Percentage Yield", ylim = c(0, 15))
hist(CatB, breaks = seq(0, 7, 1), main = "Catalyst B", xlab = "Percentage Yield", ylim = c(0, 15))
[Figure: side-by-side histograms titled "Catalyst A" and "Catalyst B"; x-axis: Percentage Yield (0 to 7), y-axis: Frequency (0 to 15)]
# Restore the single-plot layout
par(mfrow = c(1, 1))

b. Compare the shape of the percentage yields from the two catalysts observed in this sample.

Both histograms are roughly symmetric. Catalyst A is unimodal, with one primary peak between 4 and 5. Catalyst B is bimodal, with primary peaks in the 3 to 4 and 5 to 6 bins.

c. Compute the mean and median percentage yields observed for Catalyst A and Catalyst B using R. Compare both measures of center within each group and comment on how that relationship corresponds to the shapes of the data. Also compare the measures of center across the two groups and comment on how that relationship is evident in the histograms.

mean(CatA); median(CatA)
## [1] 4.148
## [1] 4.3
mean(CatA_PercY); median(CatA_PercY)
## [1] 4.148
## [1] 4.3
mean(CatB); median(CatB)
## [1] 4.073913
## [1] 3.8
mean(CatB_PercY); median(CatB_PercY)
## [1] 4.073913
## [1] 3.8

CatA: mean: 4.148, median: 4.3.
CatB: mean: 4.07, median: 3.8.
For both catalysts, the mean and median are close to one another, which is consistent with the roughly symmetric shapes of the data. Both measures of center are slightly higher for Cat A than for Cat B, which shows up as a slight shift to the right in the Catalyst A histogram.

d. Compute (in R) and compare the sample standard deviation of percentage yield from Catalyst A and Catalyst B. Comment on how the relative size of these values can be identified from the histograms. Describe in words what these values mean when considering which catalyst to use for your experiment.

sd(CatA)
## [1] 0.9760123
sd(CatA_PercY)
## [1] 0.9760123
sd(CatB)
## [1] 1.715496
sd(CatB_PercY)
## [1] 1.715496

CatA SD: 0.976
CatB SD: 1.72
The standard deviation of the Cat A percentage yields is smaller than that of Cat B. This relationship could be predicted from the histograms, since most of the Cat A values are near its center (~4), while most of the Cat B observations are away from its center (~4). Based on this sample, the percentage yield from catalyst A looks more predictable: there is less variability in the outcome.

e. Use R to create side-by-side boxplots of the two sets so they are easily comparable.

# "LABELS" and "TITLE" are placeholders; replace them with useful text
boxplot(CatA, CatB, names = c("Cat A", "Cat B"), xlab = "LABELS", main = "TITLE")
# Annotate each box with its five-number summary
text(y = fivenum(CatA), labels = fivenum(CatA), x = 0.7)
text(y = fivenum(CatB), labels = fivenum(CatB), x = 2.2)
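
As a numeric cross-check on parts d and e, the sketch below recomputes each sample standard deviation from its definition and prints the five-number summaries that the boxplots are drawn from. The helper name manual_sd and the variables sd_check, five_A, five_B, and hinge_spread are made up for this illustration and are not part of the original solution; the data vectors are copied from the exercise statement.

```r
# Data copied from the exercise statement
CatA <- c(4.3, 4.4, 3.4, 2.6, 3.8, 4.9, 4.6, 5.2, 4.7, 4.1, 2.6, 6.7, 4.1,
          3.6, 2.9, 2.6, 4.0, 4.3, 3.9, 4.8, 4.5, 4.4, 3.1, 5.7, 4.5)
CatB <- c(3.4, 5.9, 1.2, 2.1, 5.5, 6.4, 5.0, 5.8, 2.5, 3.7, 3.8, 5.1,
          3.1, 1.6, 3.5, 5.9, 6.7, 5.2, 5.8, 2.2, 4.3, 3.8, 1.2)

# Hypothetical helper: sample SD from its definition,
# sqrt( sum((x - mean(x))^2) / (n - 1) )
manual_sd <- function(x) {
  sqrt(sum((x - mean(x))^2) / (length(x) - 1))
}

# Manual SDs for both groups; these should agree with sd()
sd_check <- c(A = manual_sd(CatA), B = manual_sd(CatB))
print(round(sd_check, 3))
print(all.equal(unname(sd_check), c(sd(CatA), sd(CatB))))

# Five-number summaries: min, lower hinge, median, upper hinge, max
five_A <- fivenum(CatA)
five_B <- fivenum(CatB)
print(five_A)
print(five_B)

# Hinge spread (width of each box): another view of the variability
hinge_spread <- c(A = five_A[4] - five_A[2], B = five_B[4] - five_B[2])
print(hinge_spread)
```

boxplot() builds its boxes from the same hinges that fivenum() returns, which is why the labels added with text() in part e line up with the box edges and the median line.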