Game theory, in economics, considers the environment with multiple agents as a “game”. This is regardless of whether the agents in the game are cooperative or competitive. In AI Solutions Company, games are considered ‘adversarial’ in nature. By adversarial we mean that the environment is two-player, turn-taking in which after the game, the utility values are equal and opposite.
These take place in a deterministic, full observable environment. An example of this is chess. The rules of chess can make the computer representation of the game correct in every relevant detail. However, the presence of an opponent makes solving more complicated because it introduces uncertainty.
Let’s consider a game of two players, and call them MIN and MAX. These two players will compete where MAX will try to maximize the payoff. A search problem in the form of a game can be defined with the following 4 components:
- Initial state – This state indicates the board position and whose move it is
- Terminal test – This test indicates when the game is over
- Set of operators – These define the possible moves that a player can make
- Payoff function / Utility function – This gives a numeric value for the outcome
In a normal search problem, the objective of MAX would be to search for a payoff value that leads to a terminal state that is a winner, thereby making its first move. However, in the presence of adversary MIN, the strategy of MAX will be to find a winning terminal state regardless of what move MIN makes.
To determine the optimal strategy for MAX, the minimax algorithm is used to determine the best first move. The 5 steps in this algorithm are:
- Generate the entire game tree till the terminal nodes.
- Apply the payoff function to each terminal state to get its value.
- Use the payoff of the terminal states to determine the payoff of the nodes one level above in the search tree.
- Continue step 3 one layer at a time till you reach the root node.
- Choose the move that leads to the highest value.
Since the decision maximizes the payoff under the assumption that the opponent will play perfectly to minimize it, it is called the minimax decision.
In the two-player game tree generated in Figure 1, the A nodes indicate the moves by MAX and the V nodes indicate those of MIN. The terminal nodes show the payoff value for MAX calculated by the rules of the game, i.e. the utility function. The payoff values of the other nodes are calculated by the minimax algorithm from the payoffs of their successors. In this case, MAX’S best move is A1, and MIN’s best reply is A11.