State encoding in TD-Gammon
TD-Gammon uses a neural
network with 198 inputs. For each of the 24 board positions and for each color
there are four inputs:
1. equals true if there is at least one piece present.
2. equals true if there are at least two pieces present.
3. equals true if there are at least three pieces present.
4. has a value of $(n-3)/2$ if there are at least four pieces present, where $n$ is the number of pieces present, and is zero otherwise.
If no piece is present then all four inputs are false.
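The four per-position inputs can be sketched as a small Python function. This is only an illustration: the name `point_units` is hypothetical, true/false are represented as 1.0/0.0, and the fourth input assumes the $(n-3)/2$ scaling from the standard description of TD-Gammon.

```python
def point_units(n: int) -> list[float]:
    """Encode n pieces of one color on one position as four inputs."""
    if n < 0:
        raise ValueError("piece count must be non-negative")
    return [
        1.0 if n >= 1 else 0.0,           # at least one piece
        1.0 if n >= 2 else 0.0,           # at least two pieces
        1.0 if n >= 3 else 0.0,           # at least three pieces
        (n - 3) / 2 if n >= 4 else 0.0,   # assumed scaling for four or more
    ]


print(point_units(0))  # [0.0, 0.0, 0.0, 0.0]
print(point_units(2))  # [1.0, 1.0, 0.0, 0.0]
print(point_units(5))  # [1.0, 1.0, 1.0, 1.0]
```

The first three inputs separate the strategically distinct cases of one, two, and three pieces, while the fourth grows linearly with any further pieces.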
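Two
additional inputs encode the number of pieces that were "taken"
(hit and placed on the bar) for each color. Each one has a value of
$n/2$, where $n$ is
the number of taken pieces. Two other inputs encode the number of
pieces removed from the board for each color. Each one has a value of
$n/15$, where $n$
is the number of pieces removed. The last two boolean inputs encode,
for each player, whether it is currently that player's turn.
Together with the $24 \times 2 \times 4 = 192$ position inputs, these
six inputs give the total of 198.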
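As a rough sketch, the complete input vector could be assembled as follows. The function name `encode_state`, its argument layout, the ordering of the inputs within the vector, and the labels "white" and "black" are assumptions made for illustration. The scalings $(n-3)/2$, $n/2$ and $n/15$ follow the standard description of TD-Gammon.

```python
NUM_POINTS = 24          # positions on a backgammon board
PIECES_PER_PLAYER = 15   # each color plays with 15 pieces


def encode_state(points, bar, off, white_to_move):
    """Build the 198-element input vector (input ordering is an assumption).

    points: two lists of 24 piece counts, points[0] for white, points[1] for black
    bar:    (white, black) numbers of taken pieces
    off:    (white, black) numbers of removed pieces
    """
    vec = []
    for color in (0, 1):
        if len(points[color]) != NUM_POINTS:
            raise ValueError("expected 24 positions per color")
        for n in points[color]:
            # Four inputs per position and color.
            vec += [float(n >= 1), float(n >= 2), float(n >= 3),
                    (n - 3) / 2 if n >= 4 else 0.0]
    vec += [bar[0] / 2, bar[1] / 2]                                  # taken pieces
    vec += [off[0] / PIECES_PER_PLAYER, off[1] / PIECES_PER_PLAYER]  # removed pieces
    vec += [float(white_to_move), float(not white_to_move)]          # whose turn
    return vec


# Illustrative position: five white pieces on the first position, one white
# piece taken, all black pieces removed, white to move.
points = [[0] * NUM_POINTS, [0] * NUM_POINTS]
points[0][0] = 5
vec = encode_state(points, bar=(1, 0), off=(0, 15), white_to_move=True)
print(len(vec))  # 198
```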
Yishay Mansour
2000-01-17