Please provide step-by-step solution for the following:
Machine learning is the science of learning from experience. Suppose Alice repeatedly performs an experiment: in each round she tosses n coins, and she runs m rounds in total. In the first round, x1 coins came up heads and y1 coins came up tails; notice that x1 + y1 = n. In the second round, x2 coins came up heads and y2 came up tails; once again, x2 + y2 = n. Your job is to estimate the probability p of a single coin coming up heads.
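Before doing any estimation, it can help to see the setup concretely. Below is a minimal sketch of Alice's experiment in Python; the function name `run_experiment` and the particular values of n, m, and the true p are my own illustrative choices, not part of the problem.

```python
import random

def run_experiment(n, m, p, seed=0):
    """Simulate Alice's experiment: m rounds of n coin tosses each.

    Returns a list [x1, ..., xm] of head counts per round,
    so round i has yi = n - xi tails and xi + yi = n.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) for _ in range(m)]

# Illustrative numbers (not from the problem): 20 coins, 500 rounds.
n, m, true_p = 20, 500, 0.3
xs = run_experiment(n, m, true_p)

# The natural guess for p: total heads divided by total tosses.
p_hat = sum(xs) / (n * m)
```

With enough rounds, `p_hat` lands close to the true p, which motivates the guess asked for in part 1.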
1. What is your guess for the value of p?
2. In Maximum Likelihood Estimation, we want to find the parameter p that maximizes the probability of all the observations in the dataset. If the dataset is a matrix A whose rows a1, a2, · · · , am are the individual observations, we want to maximize P(A) = P(a1)P(a2)· · · P(am), because the individual experiments are independent. Maximizing this is equivalent to maximizing log P(A) = log P(a1) + log P(a2) + · · · + log P(am), which in turn is equivalent to minimizing −log P(A) = −log P(a1) − log P(a2) − · · · − log P(am).
3. Here you need to work out P(ai) for yourself.
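The negative log-likelihood minimization described in part 2 can be sketched in code. This assumes the answer to part 3 is the binomial model P(ai) = C(n, xi) p^xi (1 − p)^(n − xi), which follows from the independent-tosses setup; the head counts in `xs` below are made-up example data, not taken from the figure.

```python
import math

def neg_log_likelihood(p, xs, n):
    """-log P(A) = -sum_i log P(a_i), with each round binomial:
    P(a_i) = C(n, x_i) * p^x_i * (1 - p)^(n - x_i)."""
    total = 0.0
    for x in xs:
        total -= (math.log(math.comb(n, x))
                  + x * math.log(p)
                  + (n - x) * math.log(1 - p))
    return total

# Hypothetical data: head counts x_i per round (n = 10 coins, m = 5 rounds).
n = 10
xs = [3, 4, 2, 5, 3]
m = len(xs)

# Setting d/dp of -log P(A) to zero gives the closed-form MLE:
# p = (total heads) / (total tosses).
p_closed = sum(xs) / (n * m)

# Sanity check: a grid search over p in (0, 1) should land on the same value.
grid = [k / 1000 for k in range(1, 1000)]
p_grid = min(grid, key=lambda p: neg_log_likelihood(p, xs, n))
```

Both routes agree: the numerical minimizer of −log P(A) matches the closed-form answer, total heads over total tosses.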
[Figure: table of Alice's observed outcomes, one H/T entry per coin across the 1st through 7th tosses.]