Operations Research : Applications and Algorithms
4th Edition
ISBN: 9780534380588
Author: Wayne L. Winston
Publisher: Brooks Cole
Chapter 21: Simulation
Section 21.4: An Example Of Monte Carlo Simulation
Problem 3P
Question

Problem 2 part a 

Both problems are about the game of 2-player PIG, where both players are
racing to be the first to reach N_squares points. Player 1 goes first. Assume both
players are trying to maximize the chance that they win the game (i.e., they get no
credit for coming in second, even if they are close). We will use the ordinary
6-sided die version of PIG (a "1" is a bust; otherwise you advance by 2, 3, 4, 5, or 6).
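
To make the rules concrete, here is a minimal sketch of a single turn. It assumes a fixed "hold once the turn total reaches hold_at" policy, which is purely illustrative and not part of the problem; the names play_turn and hold_at are hypothetical, and the sketch ignores the end-of-game check against N_squares.

```python
import random

def play_turn(hold_at, rng, sides=6):
    """Simulate one PIG turn under an illustrative 'hold at hold_at' policy.

    Returns the points banked this turn: 0 on a bust (a roll of 1),
    otherwise the accumulated turn total.
    """
    turn_total = 0
    while turn_total < hold_at:
        roll = rng.randint(1, sides)  # uniform on 1..sides
        if roll == 1:
            return 0                  # bust: the whole turn total is lost
        turn_total += roll            # rolls 2..sides advance the turn total
    return turn_total                 # hold: bank the turn total

rng = random.Random(0)
banked = [play_turn(20, rng) for _ in range(5)]
```

Each entry of banked is either 0 (a bust) or a total between 20 and 25.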
1 Problem 1 - Optimal Strategy in 2 Player Pig
a) Model the game as a Markov Decision Process. What is the state space?
What are the possible actions? What are the transition probabilities? What
are the rewards?
b) When both players play optimally, the winning probability for Player 1
from a given state is closely related to the winning probability for Player 2 from a
given state. What is this relationship?
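
One way to write down the relationship asked about in part b) is sketched below. The notation is an assumption, not taken from the problem: W(i, j, k) denotes the probability that the player about to act wins, given own banked score i, opponent's banked score j, and current turn total k.

```latex
% Assumed notation: W_p(i, j, k) is the win probability of player p when p is
% about to act with own score i, opponent score j, and turn total k.
\begin{align*}
  % The game is symmetric, so under optimal play the value does not depend
  % on which player is to move:
  W_1(i, j, k) &= W_2(i, j, k) =: W(i, j, k), \\
  % Exactly one player wins, so the player who is not to move wins with
  % the complementary probability:
  \Pr[\text{other player wins} \mid (i, j, k)] &= 1 - W(i, j, k), \\
  % Consequently, holding hands the turn to the opponent and is worth
  W_{\text{hold}}(i, j, k) &= 1 - W(j,\, i + k,\, 0).
\end{align*}
```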
c) Use the answers from parts a) and b) to set up a value iteration algorithm
to find the optimal strategy in 2-player PIG. Your program should take as
input N_squares and return the winning probability from any given state and the
optimal action from any given state.
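
A minimal value iteration sketch for part c) follows. It rests on several assumptions that the problem does not state: the state is (i, j, k) = (mover's score, opponent's score, turn total), the mover must roll when the turn total is 0, and the opponent's win probability is one minus the mover's. The function name solve_pig and its parameters are hypothetical.

```python
def solve_pig(n_squares, sides=6, tol=1e-9, max_sweeps=10_000):
    """In-place value iteration for 2-player PIG (illustrative sketch).

    P[i][j][k] is the estimated win probability of the player to move with
    own score i, opponent score j and turn total k (only i + k < n_squares
    is stored). policy[i][j][k] is "roll" or "hold".
    """
    n = n_squares
    P = [[[0.5] * (n - i) for _ in range(n)] for i in range(n)]
    policy = [[["roll"] * (n - i) for _ in range(n)] for i in range(n)]

    def win(i, j, k):
        # Reaching n_squares with banked score plus turn total wins outright.
        return 1.0 if i + k >= n else P[i][j][k]

    for _ in range(max_sweeps):
        delta = 0.0
        for i in range(n):
            for j in range(n):
                for k in range(n - i):
                    # Roll: a 1 busts and passes the turn with nothing banked.
                    roll = (1.0 - P[j][i][0]) / sides
                    roll += sum(win(i, j, k + r)
                                for r in range(2, sides + 1)) / sides
                    # Hold: bank k and pass the turn (assumed illegal at k = 0).
                    hold = 1.0 - P[j][i + k][0] if k > 0 else -1.0
                    best = max(roll, hold)
                    delta = max(delta, abs(best - P[i][j][k]))
                    P[i][j][k] = best
                    policy[i][j][k] = "roll" if roll >= hold else "hold"
        if delta < tol:
            break
    return P, policy

P, policy = solve_pig(2)
```

As a sanity check, with N_squares = 2 any non-bust roll wins, so the first player's win probability p satisfies p = 5/6 + (1/6)(1 - p), which gives p = 6/7; P[0][0][0] should agree with that value up to the tolerance.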
2 Problem 2 - Playing PIG against the enemy
You are playing 2-player PIG against a particular adversary, who we will refer
to as "the enemy". The enemy's strategy is not known, but you have a chance
to play against them as many times as you like. (This is given to you as a
function, enemy_outcome, that returns the outcome of one turn of the enemy's
play. This function is random, since the outcome also depends on the enemy's dice
rolls.)
a) Implement the SARSA algorithm to find the optimal strategy to beat the
enemy. Your program should take as input N_squares and the enemy_outcome
function and return the winning probability from any given state and the
optimal action from any given state.
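
A minimal SARSA sketch for part a) follows. The problem does not give the signature of enemy_outcome, so this sketch assumes enemy_outcome(enemy_score, my_score) returns the points the enemy banks in one turn (0 on a bust). Further assumptions: the learner moves first, must roll when the turn total is 0, receives reward 1 for a win and 0 otherwise with no discounting (so Q-values estimate win probabilities), and explores with a fixed epsilon-greedy rule. The name sarsa_pig and all hyperparameters are hypothetical.

```python
import random
from collections import defaultdict

ROLL, HOLD = 0, 1

def sarsa_pig(n_squares, enemy_outcome, episodes=20_000,
              alpha=0.05, epsilon=0.1, seed=0, sides=6):
    """Tabular on-policy SARSA against a black-box enemy (illustrative)."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # Q[(state, action)], state = (me, enemy, turn_total)

    def choose(state):
        if state[2] == 0:                 # assumed rule: must roll at total 0
            return ROLL
        if rng.random() < epsilon:        # explore
            return rng.choice((ROLL, HOLD))
        return ROLL if Q[(state, ROLL)] >= Q[(state, HOLD)] else HOLD

    def step(state, action):
        """Apply one action; return (next_state, reward, done)."""
        me, enemy, k = state
        if action == ROLL:
            r = rng.randint(1, sides)
            if r > 1:
                k += r
                if me + k >= n_squares:
                    return None, 1.0, True      # learner wins
                return (me, enemy, k), 0.0, False
            k = 0                               # bust: nothing is banked
        me += k                                 # bank, then the enemy plays
        enemy += enemy_outcome(enemy, me)       # assumed signature
        if enemy >= n_squares:
            return None, 0.0, True              # enemy wins
        return (me, enemy, 0), 0.0, False

    for _ in range(episodes):
        state = (0, 0, 0)
        action = choose(state)
        done = False
        while not done:
            nxt, reward, done = step(state, action)
            target = reward
            if not done:
                nxt_action = choose(nxt)
                target += Q[(nxt, nxt_action)]  # on-policy bootstrap
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            if not done:
                state, action = nxt, nxt_action

    value, policy = {}, {}
    for s in {s for (s, _a) in list(Q)}:        # only states actually visited
        q_roll, q_hold = Q[(s, ROLL)], Q[(s, HOLD)]
        if s[2] == 0 or q_roll >= q_hold:
            value[s], policy[s] = q_roll, "roll"
        else:
            value[s], policy[s] = q_hold, "hold"
    return value, policy

# Usage with a stand-in enemy that never scores (purely for illustration).
value, policy = sarsa_pig(5, lambda enemy_score, my_score: 0,
                          episodes=5000, alpha=0.1)
```

The returned dictionaries cover only states visited during training, and with a fixed epsilon the learned values describe the epsilon-greedy policy rather than the purely greedy one; decaying epsilon over episodes is a common adjustment.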
b) Explain why the output in Problem 2a) could be a different strategy
than the optimal strategy you found in Problem 1c). Explain what the word
"optimal" means in Problem 1 and compare/contrast that to what "optimal"
means in Problem 2.