Both problems are about the game of 2-player PIG, where both players are racing to be the first to reach Nsquares points. Player 1 goes first. Assume both players are trying to maximize the chance that they win the game (i.e., they get no credit for coming in second, even if they are close). We will use the ordinary 6-sided die version of PIG (a "1" is a bust, and otherwise you advance by 2, 3, 4, 5, or 6).

Problem 1 - Optimal Strategy in 2 Player PIG

a) Model the game as a Markov Decision Process. What is the state space? What are the possible actions? What are the transition probabilities? What are the rewards?

b) When both players play optimally, the winning probability for Player 1 from a given state is closely related to the winning probability for Player 2 from a given state. What is this relationship?

c) Use the answers from parts a) and b) to set up a value iteration algorithm to find the optimal strategy in 2-player PIG. Your program should take Nsquares as input and return the winning probability from any given state and the optimal action from any given state.

Problem 2 - Playing PIG against the enemy

You are playing 2-player PIG against a particular adversary, whom we will refer to as "the enemy". The enemy's strategy is not known, but you have a chance to play against them as many times as you like. (This is given to you as a function, enemy_outcome, that returns the outcome of one turn of the enemy's play. This function is random, since the outcome also depends on the enemy's dice rolls.)

a) Implement the SARSA algorithm to find the optimal strategy to beat the enemy. Your program should take Nsquares and the enemy_outcome function as input and return the winning probability from any given state and the optimal action from any given state.

b) Explain why the output in Problem 2a) could be a different strategy than the optimal strategy you found in Problem 1c). Explain what the word "optimal" means in Problem 1 and compare/contrast that with what "optimal" means in Problem 2.
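As a starting point for Problem 1c), here is a minimal value iteration sketch in Python. It is one possible setup, not a reference solution. It assumes a state (i, j, k), where i is the banked score of the player about to move, j is the opponent's banked score, and k is the current turn total. It also assumes that a player whose banked score plus turn total reaches Nsquares wins immediately, and that holding with an empty turn total is not allowed. The names solve_pig, n_squares, tol, and max_sweeps are hypothetical. The code uses the zero-sum relationship from part b): the opponent's winning probability is one minus the mover's.

```python
def solve_pig(n_squares, tol=1e-9, max_sweeps=10_000):
    """Value iteration sketch for two-player Pig (assumed state encoding).

    State (i, j, k): i = mover's banked score, j = opponent's banked score,
    k = mover's current turn total. Returns (win_prob, policy), two dicts
    keyed by state, giving the mover's win probability and best action.
    """
    states = [(i, j, k)
              for i in range(n_squares)
              for j in range(n_squares)
              for k in range(n_squares - i)]
    p = {s: 0.5 for s in states}  # arbitrary initial guess

    def value(i, j, k):
        # Assumption: reaching the target is an immediate win for the mover.
        return 1.0 if i + k >= n_squares else p[(i, j, k)]

    def q_values(i, j, k):
        # Rolling a 1 busts: the turn total is lost and the opponent moves.
        roll = (1.0 - value(j, i, 0)) / 6.0
        # Rolling 2..6 adds to the turn total and the mover continues.
        roll += sum(value(i, j, k + r) for r in range(2, 7)) / 6.0
        # Holding banks k and passes the die; disallowed when k == 0.
        hold = 1.0 - value(j, i + k, 0) if k > 0 else float("-inf")
        return roll, hold

    for _ in range(max_sweeps):
        delta = 0.0
        for s in states:
            new = max(q_values(*s))
            delta = max(delta, abs(new - p[s]))
            p[s] = new  # in-place (Gauss-Seidel style) update
        if delta < tol:
            break

    policy = {}
    for s in states:
        roll, hold = q_values(*s)
        policy[s] = "roll" if roll >= hold else "hold"
    return p, policy
```

For example, solve_pig(100) returns the two dictionaries for a 100-point game; win_prob[(0, 0, 0)] is then Player 1's winning probability at the start. The in-place update is a design choice that tends to converge in fewer sweeps than keeping a separate copy of the table.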

