Assignment-2

pdf

School

McMaster University

Course

2B03

Subject

Statistics

Date

Apr 3, 2024

Type

pdf

Pages

5

Uploaded by MinisterAnt14343

2B03 Assignment 2
Probability Theory (Chapters 4 & 5)
Matthew Musulin 400329990
Due Thursday October 7 2021

Instructions: You are to use R Markdown for generating your assignment output file. You begin with the R Markdown script downloaded from A2L, and need to pay attention to information provided via introductory material posted to A2L on working with R and R Markdown. Having downloaded all necessary files, placed them in the same folder/directory, and added your answers to the R Markdown script, you then are to generate your output file using “Knit to PDF” and, when complete, upload both your R Markdown file and your PDF file to the appropriate folder on A2L.

1. Define the following terms in a sentence (or short paragraph) and state a formula if appropriate (this question is worth 5 marks).

i. Simple Random Sample

A sampling method in which every member of a given population has an equal chance of being selected.

ii. Bayes’ Theorem

A way of revising probabilities when new information is given. It takes a prior probability and revises it to create what is called a posterior probability.

iii. Mutually Exclusive Events

Mutually exclusive events are events where the occurrence of one means the other cannot occur. An example is flipping a coin: if it lands on heads, then it cannot also land on tails.

iv. General Addition Law

The probability of event A or B happening equals the probability of A alone plus the probability of B alone minus the probability of A and B happening at the same time.

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

v. General Multiplication Law

The joint probability of two events happening at the same time equals the unconditional probability of one event times the conditional probability of the other event, given that the first event has already occurred.

$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$

2. When the American League and the National League baseball champions are evenly matched, the probabilities that a World Series will end in 4, 5, 6, or 7 games are respectively 1/8, 1/4, 5/16, and 5/16. What is the expected length of a World Series when the two teams are evenly matched (hint: treat the number of games as a discrete random variable with outcome space {4, 5, 6, 7} and known probabilities, and remember that “expectation” and “mean” of a random variable denote the same thing: this question is worth 2 marks)?
    len = c(4, 5, 6, 7)
    prob = c(1/8, 1/4, 5/16, 5/16)
    ans = sum(prob*len)
    ans

    ## [1] 5.8125

3. Suppose that the probability of a success on a Bernoulli trial is π = 0.2 and is independent from trial to trial. Write down the formula for obtaining the probability of getting X = 2 successes in n = 100 trials and then, using this (or otherwise), compute p(X = 2) (hint - independent Bernoulli trials generate random variables having a binomial distribution: this question is worth 2 marks).

    n = 100
    x = 2
    p = 0.2
    prob = choose(n,x)*(p^x)*((1-p)^(n-x))
    prob

    ## [1] 6.30208e-08

4. A market research firm goes to 12 stores and determines how much (in cents) each charges for an identical tube of travel-sized toothpaste. The resulting sample of prices (measured in cents) is as follows (ignore the ## [1] at the beginning of the line: this question is worth 4 marks):

    ## [1] 102 107 100 103 109 104 101 100 107 104 104 103

i. Using R (or otherwise), calculate the sample mean, median, and mode of the toothpaste prices.

    Mean = mean(price)
    Mean

    ## [1] 103.6667

    Median = median(price)
    Median

    ## [1] 103.5

    Mode = names(table(price))[table(price)==max(table(price))]
    Mode

    ## [1] "104"

ii. Using R, calculate the sample variance, standard deviation, and interquartile range of the toothpaste prices.

    Variance = var(price)
    Variance

    ## [1] 8.060606

    SD = sd(price)
    SD

    ## [1] 2.839121

    IQ = IQR(price)
    IQ

    ## [1] 3
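The Question 4 answers for parts i and ii call functions on a vector named `price` whose definition is not shown. As a minimal, self-contained sketch, the block below assumes `price` holds the twelve values printed in the question and recomputes the same statistics. The helper `sample_mode` is a hypothetical name introduced here for illustration and is not part of the original script.

```r
# Assumption: `price` contains the twelve prices (in cents) listed in Question 4.
price <- c(102, 107, 100, 103, 109, 104, 101, 100, 107, 104, 104, 103)

# Hypothetical helper: returns every value tied for the highest frequency.
sample_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

# Measures of centre (part i) and spread (part ii).
centre <- c(mean = mean(price), median = median(price), mode = sample_mode(price))
spread <- c(variance = var(price), sd = sd(price), iqr = IQR(price))

print(centre)
print(spread)
```

Returning the mode as a number rather than a character string makes it easier to compare with the mean and median, and the helper reports all tied values if the data happen to be multimodal.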
iii. Using the moments package in R (you must install this package first, then load it via library(moments) before you call the function skewness() etc.), does the coefficient of skewness indicate that the distribution of prices is skewed to the left or to the right?

    library(moments)
    SKEW = skewness(price)
    SKEW

    ## [1] 0.4278145

The coefficient of skewness indicates that the distribution of prices is skewed to the right, as the value of the skew is positive.

iv. Using the moments package in R, does the coefficient of kurtosis indicate that the distribution of prices is more or less heavy tailed than the normal distribution?

    KURT = kurtosis(price)
    KURT

    ## [1] 2.253265

The coefficient of kurtosis indicates that the distribution of prices is less heavy tailed than the normal distribution, as the value is below 3, the kurtosis of the normal distribution.

5. Consider the two events: A = 5 or more alcoholic drinks consumed in one day last year; B = Person is female. For persons in the age group 18-24 years old, a survey taken in early 2008 by the National Center for Health Statistics suggests the probabilities $P(B) = 0.5$ (probability the person is female), $P(A \cap \bar{B}) = 0.23$ (probability the person consumed 5 or more drinks and was male) and $P(A \cap B) = 0.14$ (probability the person consumed 5 or more drinks and was female). Note that $\bar{A}$ denotes the complement of A, i.e. not A (this question is worth 5 marks). Determine the following probabilities and state in words what they measure (you must show all steps!)

i. $P(A)$

$P(A \cap \bar{B}) = P(A) - P(A \cap B)$
$0.23 = P(A) - 0.14$
$P(A) = 0.23 + 0.14$
$P(A) = 0.37$

$P(A)$ measures the probability that the person consumed 5 or more alcoholic drinks in one day last year and is 0.37.

ii. $P(\bar{A})$

$P(\bar{A}) = 1 - P(A)$
$P(\bar{A}) = 1 - 0.37$
$P(\bar{A}) = 0.63$

$P(\bar{A})$ measures the probability that the person did not consume 5 or more alcoholic drinks in one day last year and is 0.63.

iii. $P(\bar{B})$

$P(\bar{B}) = 1 - P(B)$
$P(\bar{B}) = 1 - 0.5$
$P(\bar{B}) = 0.5$

$P(\bar{B})$ measures the probability that the person is male and is 0.5.

iv. $P(\bar{A} \cap B)$

$P(\bar{A} \cap B) = P(B) - P(A \cap B)$
$P(\bar{A} \cap B) = 0.5 - 0.14$
$P(\bar{A} \cap B) = 0.36$

$P(\bar{A} \cap B)$ measures the probability that the person did not consume 5 or more alcoholic drinks and was female and is 0.36.

v. $P(\bar{A} \cap \bar{B})$

$P(\bar{A} \cap \bar{B}) = P(\bar{A}) - P(\bar{A} \cap B)$
$P(\bar{A} \cap \bar{B}) = 0.63 - 0.36$
$P(\bar{A} \cap \bar{B}) = 0.27$

$P(\bar{A} \cap \bar{B})$ measures the probability that the person did not consume 5 or more alcoholic drinks and was male and is 0.27.

6. Consider the binomial distribution with n = 5 and π = 0.7, hence the sample space for this particular discrete random variable is X ∈ {0, 1, ..., 5} (this question is worth 4 marks).

i. Calculate the mean and variance of this random variable using the $E(X) = \sum_i x_i \, p(X = x_i)$ type formulas for each (i.e., the formulas for computing the mean and variance of any discrete random variable).

$\Pr(x = 0) = \binom{5}{0} 0.7^0 (1 - 0.7)^5 = 0.00243$
$\Pr(x = 1) = \binom{5}{1} 0.7^1 (1 - 0.7)^4 = 0.02835$
$\Pr(x = 2) = \binom{5}{2} 0.7^2 (1 - 0.7)^3 = 0.1323$
$\Pr(x = 3) = \binom{5}{3} 0.7^3 (1 - 0.7)^2 = 0.3087$
$\Pr(x = 4) = \binom{5}{4} 0.7^4 (1 - 0.7)^1 = 0.36015$
$\Pr(x = 5) = \binom{5}{5} 0.7^5 (1 - 0.7)^0 = 0.16807$

Mean: $E(X) = \sum_i x_i \Pr(x = x_i) = 3.5$

Variance: $V(X) = \sum_i (x_i - E(X))^2 \Pr(x = x_i) = 1.05$
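The sums in part i of Question 6 can be checked numerically. The following is a minimal sketch, assuming the same n = 5 and π = 0.7 as in the question. The variable names (`pmf`, `mean_x`, `var_x`) are illustrative choices and do not come from the original script.

```r
# Binomial parameters taken from Question 6.
n <- 5
p <- 0.7

# Outcome space 0, 1, ..., 5 and the probability of each outcome,
# written out with the binomial formula rather than dbinom().
x <- 0:n
pmf <- choose(n, x) * p^x * (1 - p)^(n - x)

# General discrete-random-variable formulas for the mean and variance.
mean_x <- sum(x * pmf)
var_x <- sum((x - mean_x)^2 * pmf)

print(round(pmf, 5))
print(c(mean = mean_x, variance = var_x))
```

Because the mean and variance here come only from the general formulas for a discrete random variable, they give an independent check on the binomial short-cut formulas n × π and n × π × (1 − π).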
ii. Repeat i. above using the short-cut formula (i.e., compute the mean and variance using the formulas depending only on π and n that are specific to the binomial distribution).

$E(x) = n \times \pi$
$E(x) = 5 \times 0.7$
$E(x) = 3.5$

$V(x) = n \times \pi \times (1 - \pi)$
$V(x) = 5 \times 0.7 \times (1 - 0.7)$
$V(x) = 1.05$

7. Suppose that the subjective prior probabilities of finding oil are p(oil) = 0.5 and p(no oil) = 0.5. Your company drills for 500 ft and finds no oil, but they sample the soil at 500 ft. From experience the company knows that the conditional probability of finding this type of soil given that oil is present is 0.1, while the probability of finding this type of soil given that no oil is present is 0.9. What is the posterior probability of finding oil (i.e., the probability of finding oil given that this type of soil is present)? You must show all of your work (hint - think “Reverend Thomas Bayes”, and you can use the symbols $S$ and $\bar{S}$ for “this type of soil” and “not this type of soil”, respectively, the symbols $O$ and $\bar{O}$ for “oil” and “no oil”, respectively, and $p(O \mid S)$ for the posterior probability: this question is worth 2 bonus marks)
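Question 7 is a direct application of Bayes' Theorem. The block below is a minimal sketch of that calculation, assuming only the four probabilities stated in the question. It is not taken from the original submission, and all variable names are illustrative.

```r
# Prior probabilities stated in Question 7.
prior_oil <- 0.5
prior_no_oil <- 0.5

# Conditional probabilities of observing this soil type, p(S | O) and p(S | not O).
soil_given_oil <- 0.1
soil_given_no_oil <- 0.9

# Total probability of observing this soil type, p(S).
p_soil <- soil_given_oil * prior_oil + soil_given_no_oil * prior_no_oil

# Bayes' Theorem: p(O | S) = p(S | O) p(O) / p(S).
posterior_oil <- soil_given_oil * prior_oil / p_soil

print(posterior_oil)
```

Under these inputs, observing the soil lowers the probability of oil from the prior of 0.5 to a posterior of 0.1, because this soil type is nine times as likely when no oil is present.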