I'm back at min-max with a short question: how do you search the possible sequences of moves in a game?
Well, there are two ways. The first is to copy the game state and try the different plays on the copy. This guarantees that you don't accidentally manipulate the real game when you only wanted to test some moves for the computer. It has two downsides: one, you have to take some precautions so that your game state is copyable. Magic is a complicated game, and it's easy to forget things. Two, it takes up space. While that doesn't sound like much of a problem nowadays, Java by default starts with 32 MB of memory. I ran a test, and that wasn't even enough for the full game tree of the much simpler game of Quarto, which I presented earlier.
The second way is to enable undo: you work with the original game, make moves, and undo them after running the evaluation function. Memory is not the problem here, but the first downside stays: an undo function is very complicated to implement, because Magic has so many facets.
I think the second approach is the better one. While it may take more work, you will want an undo function at some point anyway, and it implicitly enables some cool stuff: given the initial game state (the order of the libraries), you can duplicate a game by applying the same actions to that state. This allows you to store a game as a file, replay awesome matches, and easily implement play over the network by simply transmitting each action.
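The undo approach can be sketched as a command pattern: every move is an action that knows how to reverse itself, and the game keeps a history stack. This is a minimal sketch with made-up names (GameState, Action, LoseLifeAction) that are not from the actual project; the real Magic state is of course far more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Every change to the game state is an Action that can undo itself.
interface Action {
    void execute(GameState state);
    void undo(GameState state);
}

class GameState {
    int life = 20;
    private final Deque<Action> history = new ArrayDeque<>();

    void apply(Action a) {
        a.execute(this);
        history.push(a);   // remember the action so it can be taken back
    }

    void undoLast() {
        history.pop().undo(this);
    }
}

// A trivial action: losing life. Undo restores the exact amount.
class LoseLifeAction implements Action {
    private final int amount;
    LoseLifeAction(int amount) { this.amount = amount; }
    public void execute(GameState s) { s.life -= amount; }
    public void undo(GameState s)    { s.life += amount; }
}

public class UndoDemo {
    public static void main(String[] args) {
        GameState state = new GameState();
        state.apply(new LoseLifeAction(3)); // try a move: life is now 17
        state.undoLast();                   // take it back: life is 20 again
        System.out.println(state.life);     // prints 20
    }
}
```

The same action list, applied to a fresh initial state, replays the game, which is exactly what makes saved games and network play fall out of this design for free.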
Showing posts with label Min-Max. Show all posts
Sunday, November 29, 2009
Creating a state tree
Monday, November 9, 2009
Performance of min-max
The min-max algorithm builds a tree of all possible moves and the resulting game states, and then determines the optimal play. This sounds very expensive, and it is. To emphasize this, look at these numbers (I don't optimize away anything like symmetric or rotated board positions, and I don't terminate before the board is full):
Tic-tac-toe, from my previous example, has one type of piece per player and nine fields. The first move can therefore be one of nine possibilities, the second move has only eight free fields left, and so on.
The total number of games is therefore 9*8*...*2*1 = 9! = 362880.
Quarto is a "little" more complicated:
- there are sixteen pieces with four characteristics each:
  - short/tall
  - light/dark
  - hollow/solid
  - square/round
- there are sixteen fields, arranged in a square
- on his move, a player may take any piece and place it anywhere on the board
- a player wins if his move creates a line of four pieces that share at least one characteristic; lines can run in any direction, including diagonally
The total number of games is therefore (16!)² = 437763136697395052544000000 ~ 437*10²⁴.
I had to use a dedicated mathematics program to calculate this, because regular Java 64-bit longs weren't enough!
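For what it's worth, the number can also be computed in plain Java with BigInteger, which is one way around the 64-bit long limit:

```java
import java.math.BigInteger;

public class QuartoGames {
    public static void main(String[] args) {
        // 16! already needs 45 bits; (16!)² overflows a 64-bit long.
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= 16; i++) {
            f = f.multiply(BigInteger.valueOf(i));
        }
        // (16!)²: one factor for the order of pieces, one for the fields
        System.out.println(f.multiply(f)); // prints 437763136697395052544000000
    }
}
```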
You see, creating every possible move is practically impossible for a game like Quarto, and Magic doesn't even have a finite game tree (for example, if both players control a Necropotence or something similar, they can pass the turn back and forth indefinitely). So how can you apply min-max to Magic?
By introducing a reward function: instead of looking ahead until the end of the game and saying -1, 0 or 1, you evaluate a board state partway through and assign it some value between -1 and 1.
This is the real art: the reward function directly determines your AI's strength. The better it evaluates the board, the more sensibly the AI will behave.
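A depth-limited min-max with such a reward function might look like the sketch below. Node and its reward field are illustrative stand-ins, not the real game model: instead of searching to the end of the game, the recursion stops at maxDepth and falls back to the heuristic value.

```java
import java.util.List;

// A game state with a heuristic evaluation and its legal follow-ups.
class Node {
    double reward;        // heuristic value between -1 and 1
    List<Node> children;  // legal follow-up states (empty = game over)

    Node(double reward, List<Node> children) {
        this.reward = reward;
        this.children = children;
    }
}

public class MinMax {
    // maximizing == true means it's our turn, so we take the best child
    static double minmax(Node n, int depth, boolean maximizing) {
        if (depth == 0 || n.children.isEmpty()) {
            return n.reward;  // cutoff: trust the reward function
        }
        double best = maximizing ? -1 : 1;
        for (Node child : n.children) {
            double v = minmax(child, depth - 1, !maximizing);
            best = maximizing ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }

    public static void main(String[] args) {
        // Tiny tree: we move first, the opponent answers.
        Node left  = new Node(0, List.of(
                new Node(-1, List.of()),      // opponent can force a loss
                new Node( 0, List.of())));
        Node right = new Node(0, List.of(
                new Node( 0, List.of())));    // forced draw
        Node root  = new Node(0, List.of(left, right));
        System.out.println(minmax(root, 2, true)); // prints 0.0
    }
}
```

The deeper you search before cutting off, the less the result depends on the quality of the reward function, at the usual exponential cost.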
The Min-Max Algorithm
I like to say "AI is an interesting topic, as long as you don't have to implement it". I think this is very true: AI is obviously fascinating, but implementing some "intelligent" algorithm can be anything from annoying, boring and monotonous to challenging and hard.
The min-max algorithm builds on a simple thought: every player makes the move that is most beneficial to him, and every game state can be measured somehow. The easiest state to measure is one that ends the game: the winning player gets a "reward" of 1 (the maximum) and the losing player a reward of -1 (the minimum).
You see here the end of a tic-tac-toe game. Green (and the single yellow node) are your (circle's) turns; orange is the opponent's.
First, you build the decision tree from the initial state down to the final states. Then you determine the outcomes of the final states. Most of them are draws in this example, meaning no reward for anybody; two of them are losses and are assigned a reward of -1.
Now you walk back up the tree to the predecessors of the bottommost states. Since it is you who makes the move there, you maximize your reward. For example, the maximum reward reachable from the yellow node's predecessor is 0, so that node is assigned 0, too.
One level further up, it's your opponent's turn. That node has two successors, one valued 0 and one valued -1. Your opponent minimizes your reward, so the node is assigned the minimum of its children's values, -1.
Finally, we reach the current turn, where you have three choices: one leading to a draw and two to a loss. You maximize your own reward and choose the option that leads to the draw; that is the decision made by the min-max algorithm. At this point you already know that the game will end in a draw, provided your opponent reasons the same way. It's tic-tac-toe, after all!
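The backup just described can be written out as a small computation. The three options and their follow-up values (one draw, two losses) are taken from the walkthrough above, not from an actual board model:

```java
public class Backup {
    public static void main(String[] args) {
        // Our three choices; each inner array holds the values of the
        // opponent's possible replies in that subtree.
        int[][] opponentChoices = {
            {0},       // first option: forced draw
            {0, -1},   // second option: the opponent can force a loss
            {-1}       // third option: immediate loss
        };
        int best = -1;                       // -1 is the worst possible reward
        for (int[] replies : opponentChoices) {
            int worst = 1;                   // the opponent minimizes our reward
            for (int r : replies) worst = Math.min(worst, r);
            best = Math.max(best, worst);    // we maximize over our choices
        }
        System.out.println(best);            // prints 0: the draw is our best line
    }
}
```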
That's the basic idea behind min-max, but I guess you already see why it's not directly applicable to Magic. Check back for the details!