An Introduction to Game Tree Algorithms

by Hamed Ahmadi (www.hamedahmadi.com)

Contents

Preface
Introduction
The Analysis Function
The Game Tree
The MiniMax Algorithm
The NegaMax Algorithm
Alpha-Beta Pruning and Branching
The Horizon Effect
References

Preface
A number of people have asked me about the algorithms I used in my chess and reversi programs. What you see here has been put together in an attempt to answer these questions. Please feel free to
contact me if you still have questions after reading this!

Please remember that the methods presented here serve only as an introduction, and are not the only methods. On the contrary, the alpha-beta pruning described here can be improved, and a number of other algorithms also exist.

Introduction
The games we will be talking about are two-player, zero-sum games. Zero-sum games are games such as chess, checkers, and reversi, in which the total amount of "payoff" is constant: any game situation is exactly as good for player A as it is bad for player B, and vice versa. It also generally helps if the game is finite! A finite game is a game in which every possible sequence of moves eventually leads to an end of the game; that is, it is impossible for the game to go on forever!

I will be assuming that the game is a board game, and that the two players are blue and red, with blue moving first. The general approach to writing a program to play such a game is to look ahead a few moves and analyze different possibilities. Thus we have two major concerns: devising an analysis function, and searching through different positions.

The Analysis Function
The analysis function A(p) assigns a number to the board position p. Higher values of A(p) indicate that the position favors blue, while lower values mean that red is winning. Hence, for a position p in which blue has won, A(p) is +∞, while A(p) is -∞ if red has won. A draw position can be given the value 0.

The analysis function is a heuristic; in actuality it is very difficult to find a precise value. However, that doesn't stop us from trying! A good analysis function takes into account many different aspects of the board position, such as the number of pieces, control of the board, control of certain key squares, and mobility. Mobility indicates how freely each player can move his or her pieces, and how many moves each has available.
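To make this concrete, here is a minimal sketch in C++ of what an analysis function might look like. Everything here (the Board fields, the weights) is hypothetical, and only illustrates the idea of combining material and mobility into a single signed number.

#include <iostream>

// Hypothetical board summary for some game; a real program would compute
// these numbers from the actual position.
struct Board {
    int bluePieces, redPieces;   // material
    int blueMoves,  redMoves;    // mobility: number of legal moves available
};

// Positive values favor blue, negative values favor red.
int Analysis(const Board& b) {
    const int PIECE_WEIGHT = 10;   // material usually matters more...
    const int MOVE_WEIGHT  = 1;    // ...than mobility
    return PIECE_WEIGHT * (b.bluePieces - b.redPieces)
         + MOVE_WEIGHT  * (b.blueMoves  - b.redMoves);
}

int main() {
    Board b = {8, 7, 20, 15};             // blue is a piece up and more mobile
    std::cout << Analysis(b) << "\n";     // prints 15
}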

Much of the work in writing a game program involves writing a good analysis function. Usually, after you implement a good search procedure, the best way to improve your program's gameplay is to improve your analysis function! This stage is so important that often complicated methods, like genetic algorithms, are used to tune the analysis function. However, I recommend that you not spend too much time on analysis until you write the actual search procedure.

The Game Tree
The first algorithm that comes to mind is to always make the move which yields the highest analysis value. However, this will not work very well in practice, because, as mentioned before, the analysis is only a guess at how good the position is. Consider, for example, an analysis function for the game of chess which uses only the material value (the sum of piece values) for each player. A "greedy" algorithm which always chooses the move that maximizes the analysis might easily capture a pawn with its queen, ignoring the fact that the queen will be captured on the next move!

Therefore, we need a way to "look forward" a few moves. This is accomplished by searching a game tree, in which the board positions are organized. In a game tree, every node corresponds to a board position. The children of each node N are the different positions that result from a single move in N, made by the player whose turn it is at N. The root of the tree is the starting position, i.e. the position from which we want to find a move. The leaves (nodes which have no children) correspond to terminating positions: positions in which the game has ended.

A portion of the game tree for "Tic-tac-toe" is shown in Figure 1. Note that each level in the tree pertains to the moves made by one particular player.

The MiniMax Algorithm
The minimax algorithm is used to find accurate values for the board positions. We assume that each player always plays his/her best move in any given position. With this assumption, two observations lead to the minimax algorithm: we have accurate values for the leaves, and the value of a node can be determined accurately from its children's values. The value of a leaf is easily determined, because in these positions the game has ended.

Each node that is not a leaf corresponds to a decision made by the player whose turn it is at the node's level. The player making a move at the node must decide which of that node's children it should choose as the next board position. It is clear that if the player is blue, he should choose the child with the largest value. Likewise, red should choose the position which minimizes her node's value. In this manner, we can determine the value for each node, and in doing so we find the best move possible from each node. Figure 2 illustrates this method.

For a small game like Tic-tac-toe, we can build and search the entire tree, all the way down to the actual leaves. However, this is clearly impossible for a more complicated game like reversi or chess. Here is where we finally use the analysis function: since we can't search the entire tree, we only consider the first few levels. In this manner we pretend that the nodes at our maximum depth are leaves, and determine their values using the analysis function.

We can implement the procedure recursively, using two functions, one for red and one for blue.

int BlueValue(Board b, int depth) {
    if (GameOver(b) or depth > MaxDepth)   // treat this node as a leaf
        return Analysis(b)
    int max = -infinity
    for each legal move m in board b {
        copy b to c
        make move m in board c
        int x = RedValue(c, depth+1)       // red moves next
        if (x > max) max = x               // blue maximizes
    }
    return max
}

int RedValue(Board b, int depth) {
    if (GameOver(b) or depth > MaxDepth)   // treat this node as a leaf
        return Analysis(b)
    int min = infinity
    for each legal move m in board b {
        copy b to c
        make move m in board c
        int x = BlueValue(c, depth+1)      // blue moves next
        if (x < min) min = x               // red minimizes
    }
    return min
}

Here MaxDepth is a constant defining the maximum search depth. Each of these functions is initially called with depth=0, and the value returned is the value of the node corresponding to board b. Now, in order to find the best move from any position, we just need to choose the child with the best (minimum or maximum, depending on color) value, as the complete example below shows.
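To see the whole procedure run end to end, here is a complete C++ program for a toy game used purely as an illustration (it is not part of the discussion above): a pile of stones, where each move removes 1 to 3 stones, and whoever takes the last stone wins. Blue maximizes and red minimizes, exactly as in the pseudocode, and the loop in main does the root move selection just described.

#include <iostream>
#include <algorithm>

const int MaxDepth = 32;   // deep enough that the cutoff never triggers here
const int WIN = 1000;      // stands in for "infinity" at won positions

bool GameOver(int pile) { return pile == 0; }

int RedValue(int pile, int depth);          // forward declaration

int BlueValue(int pile, int depth) {        // blue to move: maximize
    if (GameOver(pile)) return -WIN;        // no stones left: red took the last one and won
    if (depth > MaxDepth) return 0;         // a real game would call Analysis() here
    int max = -WIN - 1;
    for (int m = 1; m <= std::min(3, pile); ++m)
        max = std::max(max, RedValue(pile - m, depth + 1));
    return max;
}

int RedValue(int pile, int depth) {         // red to move: minimize
    if (GameOver(pile)) return WIN;         // blue took the last stone and won
    if (depth > MaxDepth) return 0;
    int min = WIN + 1;
    for (int m = 1; m <= std::min(3, pile); ++m)
        min = std::min(min, BlueValue(pile - m, depth + 1));
    return min;
}

int main() {
    int pile = 10;
    // Root move selection: blue picks the child with the largest value.
    int bestMove = 1, bestValue = -WIN - 1;
    for (int m = 1; m <= std::min(3, pile); ++m) {
        int x = RedValue(pile - m, 1);
        if (x > bestValue) { bestValue = x; bestMove = m; }
    }
    std::cout << "Blue takes " << bestMove << " stone(s)\n";   // prints: Blue takes 2 stone(s)
}

The best first move takes 2 stones, leaving a pile of 8: piles that are multiples of 4 are lost for the player to move.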

The NegaMax Algorithm
NegaMax isn't really a different algorithm; it's just a more elegant implementation. The problem with the above algorithm is that we have two different functions that are essentially doing the exact same thing. Negamax merges these two into one, by always considering the children of the node N, from N's point of view. That is, we multiply values in alternate rows by -1, so that both players become maximizers. This concept is illustrated in Figure 3, which essentially depicts the same tree as Figure 2. The resulting values are as they would be if we were to evaluate node values using the minimax algorithm above, and then multiply blue rows by -1. Note that it is not necessary to do this for the root of the tree.

One way to implement negamax is shown below.

 1  const int sign[2]={1,-1}   //0 is blue, 1 is red
 2  
 3  int NegaMax(Board b, int depth, int color) {
 4      if (GameOver(b) or depth>MaxDepth)
 5          return sign[color]*Analysis(b)
 6      int max = -infinity
 7      for each legal move m in board b {
 8          copy b to c
 9          make move m in board c
10          int x= - NegaMax(c, depth+1, 1-color)  //Note the "-" before "NegaMax"
11          if (x>max) max = x
12      }
13      return max
14  }
The function NegaMax returns the value of the node b as seen by the player indicated by color. That is, the better the position is for color, the greater the returned value. Note that this is exactly the opposite of what we see in Figure 3 (in Figure 3, nodes were considered from their parent's point of view). However, this is fixed in line 10, where the value of the child is reversed (multiplied by -1). In this manner, node values are processed as in Figure 3, with the exception of the root. Reversing the root, as mentioned before, is not important; in fact, it can be better if we don't reverse it. This way, the true value (the value it would have had in minimax) of a node b with color c is returned by NegaMax(b,0,c).

Note also that we multiply the analysis value of leaves by sign[color] in line 5. We need to be careful not to leave out details like that, or to get them wrong. A common mistake in negamax is to get the signs mixed up: it is all too easy to end up with a program that always finds the worst possible move for each player. Even worse, you might end up with a program that always finds the best move for one color, but the worst move for the other. The toy example below shows the sign flip in isolation.
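Here is the same toy take-away game from the earlier example (a pile of stones, remove 1-3, taking the last stone wins) rewritten in negamax form, again purely as an illustration. Because a finished game is evaluated directly from the point of view of the player to move, the only sign that appears is the flip on the recursive call; getting that one flip wrong produces exactly the worst-move behavior described above.

#include <iostream>
#include <algorithm>

const int WIN = 1000;

// Value of the position from the point of view of the player to move.
int NegaMax(int pile) {
    if (pile == 0) return -WIN;   // the previous player took the last stone: we lost
    int max = -WIN - 1;
    for (int m = 1; m <= std::min(3, pile); ++m)
        max = std::max(max, -NegaMax(pile - m));   // flip: the child's value is the opponent's view
    return max;
}

int main() {
    std::cout << NegaMax(10) << "\n";   // prints 1000: the player to move wins from 10
}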

Alpha-Beta Pruning and Branching
So now we have a search algorithm. How fast do you think it will be? How many levels can you search in a reasonable time? Since most games easily have 10-20 possible moves in every position, the number of nodes we have to search grows exponentially with the number of levels: with 20 moves per position, looking ahead just 6 levels already means 20^6 = 64 million positions. In a game like chess, where a position can sometimes offer over 50 or even 100 possible moves, this problem is even more drastic. Using what we've covered so far, we can only search through 3-4 levels within a reasonable time.

Thus we need a way to speed up our search procedure. How? Consider how a human would play a game. She would say, "OK, if I move my queen there, it would be captured immediately by my opponent's knight. So I won't move my queen there." But what would our program do in the same situation? It would consider moving the queen to where it could easily be captured, and then waste time searching through thousands of nodes that result from moving pawns on the other side of the board! Eventually it would find the move in which the knight captures the queen, but it needn't have taken so long.

Alpha-Beta pruning uses the above ideas to drastically reduce the number of nodes we search. Pruning is the elimination of nodes found to be unnecessary to process while searching. Consider the tree in Figure 4. Note that for simplicity, this tree has not been "negamaxed".

Let's follow our algorithm, assuming that nodes are processed from left to right. While trying to find the value of A, we go down to B and determine its value to be 6, the minimum of 6 and 9. Then control is passed to C. From there we compute E to be 4, the maximum of 4 and -2. Since C is a minimizer, we now know that the value of C is less than or equal to 4. Therefore, C will not be chosen by A as the maximum, since A already has the child B with value 6, which will be greater than any value C may later have. Hence, searching F and G is useless, and we can immediately stop processing (prune) C. Control then passes to D.

The same thing can happen if A is a minimizer. In general, we can define lower and upper bounds for the value of a node, called alpha and beta, respectively. One of these bounds (depending on the player's color) does not change within the node, and if the node's value crosses this bound, we can prune the node. The other bound, however, changes as the node's value is updated.

This can be implemented using minimax as below.

int BlueValue(Board b, int depth, int alpha, int beta) {
    if (GameOver(b) or depth > MaxDepth)
        return Analysis(b)
    int max = -infinity
    for each legal move m in board b {
        copy b to c
        make move m in board c
        int x = RedValue(c, depth+1, alpha, beta)
        if (x > max) max = x
        if (x > alpha) alpha = x            // raise the lower bound
        if (alpha >= beta) return alpha     // cutoff: red will never allow this node
    }
    return max
}

int RedValue(Board b, int depth, int alpha, int beta) {
    if (GameOver(b) or depth > MaxDepth)
        return Analysis(b)
    int min = infinity
    for each legal move m in board b {
        copy b to c
        make move m in board c
        int x = BlueValue(c, depth+1, alpha, beta)
        if (x < min) min = x
        if (x < beta) beta = x              // lower the upper bound
        if (alpha >= beta) return beta      // cutoff: blue will never allow this node
    }
    return min
}

Note that we can write alpha>=beta instead of alpha>beta, which slightly increases our program's speed. These functions are initially called with depth=0, alpha=-∞, and beta=∞.

The negamax version can be written as below.

const int sign[2]={1,-1}   //0 is blue, 1 is red

int NegaMax(Board b, int depth, int color, int alpha, int beta) {
    if (GameOver(b) or depth > MaxDepth)
        return sign[color]*Analysis(b)
    int max = -infinity
    for each legal move m in board b {
        copy b to c
        make move m in board c
        int x = -NegaMax(c, depth+1, 1-color, -beta, -alpha)   // negate and swap the bounds
        if (x > max) max = x
        if (x > alpha) alpha = x
        if (alpha >= beta) return alpha     // cutoff
    }
    return max
}
Note that alpha and beta are replaced with -beta and -alpha, respectively, when NegaMax is called within itself.
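As a concrete sketch, here is alpha-beta negamax applied to the same toy take-away game used earlier (illustrative only); the negated, swapped window on the recursive call is the only thing that changes.

#include <iostream>
#include <algorithm>

const int WIN = 1000;

int NegaMax(int pile, int alpha, int beta) {
    if (pile == 0) return -WIN;   // the player to move has lost
    int max = -WIN - 1;
    for (int m = 1; m <= std::min(3, pile); ++m) {
        int x = -NegaMax(pile - m, -beta, -alpha);   // the child sees a negated, swapped window
        if (x > max) max = x;
        if (x > alpha) alpha = x;
        if (alpha >= beta) return alpha;             // cutoff
    }
    return max;
}

int main() {
    std::cout << NegaMax(10, -WIN - 1, WIN + 1) << "\n";   // prints 1000, as before
}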

Alpha-Beta pruning, used properly, will usually allow us to search at least 2-3 extra levels. However, pruning alone is not enough! Let's look back for a moment at Figure 4. Why was C pruned? C was pruned because B, which was searched before C, had a value better than any value C could possibly have; we knew that C would not be chosen, because we'd already found a better move in A. Thus, in order to maximize the number of nodes that get pruned, we try to check the moves that "look better" first. That is, we process the children of a node in the order of how good we expect each move to be. This is called branching. Here, we use another (cheaper) heuristic to compare moves, such as the analysis value of the position a move leads to, or whether the move captures a piece.

The best way to find out which method to use is to experiment! See which works best for your program. After we find the move values, we can sort the moves in decreasing order and process them in that order. However, a heap data structure may prove more efficient than sorting, because when pruning takes place we leave the function, and thus never need the rest of the moves sorted; the sketch below illustrates this.
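Here is a sketch of the heap idea in C++. The Move type and its score are placeholders for whatever your game provides; the point is that building the heap costs O(n) and each pop costs O(log n), so a node that is pruned after one or two moves never pays for sorting the whole move list.

#include <vector>
#include <algorithm>

// Hypothetical move type; a real program would score moves by, say, the
// value of the captured piece or a quick analysis of the resulting position.
struct Move { int score; /* plus game-specific fields */ };

bool Worse(const Move& a, const Move& b) { return a.score < b.score; }   // max-heap order

void SearchChildren(std::vector<Move> moves) {
    std::make_heap(moves.begin(), moves.end(), Worse);      // O(n), cheaper than a full sort
    while (!moves.empty()) {
        std::pop_heap(moves.begin(), moves.end(), Worse);   // best remaining move to the back
        Move m = moves.back();
        moves.pop_back();
        (void)m;   // ... search m here; on an alpha-beta cutoff, simply return
    }
}

int main() {
    SearchChildren({{3}, {7}, {1}});   // moves are visited in the order 7, 3, 1
}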

Another method sometimes used to increase speed is to divide the number returned by the analysis function by a constant value (for example, divide it by 10). This reduces the range of values that the nodes in the game tree can have, so ties (and hence alpha>=beta cutoffs) become more common, which can result in more pruning. However, this also makes the analysis less accurate, so it should be used with care.

The Horizon Effect
Remember why we decided to use "look-ahead" with game trees in the first place? The problem was that we couldn't detect dangerous situations: we might do something bad whose consequences the analysis function can't immediately see. So why shouldn't the same problem still exist when we're looking several moves ahead? After all, we're still using the analysis function to guess the value of most nodes.

The fact is, the problem is still there, though it is less drastic, because we often have time to fix things before they get too bad. The problem is called the horizon effect, and happens when we use a constant maximum depth. In this situation, there is always a "horizon" over which we cannot see. At every move, the horizon moves back a bit, revealing something that we couldn't see before. How can we prevent being surprised by something bad suddenly appearing after the horizon moves back?

Usually such events occur when there is a node at our bottom level (at the horizon) in which a strong move exists, but is not considered since we are at our maximum depth. To avoid this situation, we allow our program to go beyond the maximum depth under certain circumstances, when the position is "hot". For example, good chess programs usually follow sequences of captures to a great depth. That is, we can extend the maximum depth for a node in which a capture of a piece other than a pawn is possible, or in which the king is in check; a sketch of such an extension rule follows.
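Here is one way such an extension rule might look, as a C++ sketch. GameOver and HotPosition are game-specific stand-ins (HotPosition would check, say, whether a non-pawn capture or a check is possible), and the depth numbers are arbitrary.

#include <cstdio>

const int MaxDepth   = 6;    // the normal horizon
const int ExtraDepth = 4;    // how far past the horizon "hot" lines may go

struct Board { bool hot; };  // stand-in for a real position

bool GameOver(const Board&)      { return false; }
bool HotPosition(const Board& b) { return b.hot; }   // e.g. capture available, or king in check

// Decide whether the search should treat this node as a leaf.
bool ShouldStop(const Board& b, int depth) {
    if (GameOver(b)) return true;
    if (depth > MaxDepth + ExtraDepth) return true;        // hard limit, even for hot lines
    if (depth > MaxDepth && !HotPosition(b)) return true;  // quiet position at the horizon
    return false;
}

int main() {
    Board quiet = {false}, hot = {true};
    std::printf("%d %d\n", ShouldStop(quiet, 7), ShouldStop(hot, 7));   // prints 1 0
}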

References
This introduction has been written using my personal experience, bits of information I've picked up here and there, and algorithms I learned from various books and articles on the subject.

The internet is also an excellent place to get information about game algorithms.



Please feel free to send all comments, feedback, etc. to me.
© 2002-2012 by Hamed Ahmadi. All rights reserved.