Workshop in Reinforcement Learning (0368-3500-13)
Workshop project:
- The project will be done in groups of 2-3 students.
- Each group will implement a learning algorithm for a board game.
- The background material needed will be covered in the lectures.
- Requirements document
Suggested Projects
More Challenging Projects
Workshop Outline
Week 1: Min-Max Trees
Week 2: Introduction to Reinforcement Learning: Model and Planning
Week 3: Reinforcement Learning: Learning (small state space)
Week 4: Reinforcement Learning: Learning (large state space)
Week 5: Simple Graphics (GUI)
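As an illustration of the learning material in Weeks 3-4, the sketch below runs tabular Q-learning on a toy 5-state chain (move right to reach a rewarding terminal state). The environment, rewards, and parameters are our own assumptions for the example, not workshop material.

```java
import java.util.Random;

// Tabular Q-learning on a toy 5-state chain: states 0..4, state 4 is
// terminal with reward +1; actions are LEFT and RIGHT. Illustrative only.
public class QLearn {
    static final int N = 5;
    static final int LEFT = 0, RIGHT = 1;

    public static double[][] train(int episodes, long seed) {
        double[][] q = new double[N][2];       // Q-table, initialized to 0
        double alpha = 0.5, gamma = 0.9, eps = 0.2;
        Random rnd = new Random(seed);
        for (int ep = 0; ep < episodes; ep++) {
            int s = 0;
            while (s != N - 1) {
                // epsilon-greedy action selection
                int a = (rnd.nextDouble() < eps) ? rnd.nextInt(2)
                        : (q[s][RIGHT] >= q[s][LEFT] ? RIGHT : LEFT);
                int s2 = (a == RIGHT) ? s + 1 : Math.max(0, s - 1);
                double r = (s2 == N - 1) ? 1.0 : 0.0;
                // bootstrap from the next state's best value (0 at terminal)
                double target = r + (s2 == N - 1 ? 0.0
                        : gamma * Math.max(q[s2][LEFT], q[s2][RIGHT]));
                q[s][a] += alpha * (target - q[s][a]);   // TD update
                s = s2;
            }
        }
        return q;
    }

    public static void main(String[] args) {
        double[][] q = train(500, 42);
        for (int s = 0; s < N - 1; s++)
            System.out.println("state " + s + " best action: "
                    + (q[s][RIGHT] > q[s][LEFT] ? "RIGHT" : "LEFT"));
    }
}
```

After training, the greedy policy moves right in every state, and the Q-values approach the discounted returns 0.9^k for k steps from the goal.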
Teams and Games
1. Nir Nahum and Eran Gewurtz: Four in a Row.
2. Uri Ravzin, Yuval Steinberg and Zion Zatlawi: Four in a Row.
3. Yael Kagan, Max Shifrin and Mark Sandler: Gomoku (generalized X-O).
4. Noam Kovacs, Oren Solomianik and Gennady Verdel: Breakthrough.
5. Negev Nosatzki, Moran Shekel and Oshrit Feder: Backgammon.
6. Daniel Rosenblatt and Benny Trachtenbrot: Mancala.
Sample Code
Basic Tic-Tac-Toe implemented in C++.
Basic Tic-Tac-Toe implemented in Java.
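For Week 1's min-max trees, a minimal minimax sketch for 3x3 tic-tac-toe (our own illustration, independent of the C++ and Java samples above) might look like:

```java
// Minimal minimax for 3x3 tic-tac-toe. Board cells: 0 = empty,
// 1 = X (maximizer), -1 = O (minimizer). Illustrative sketch only.
public class MiniMax {
    static int winner(int[] b) {
        int[][] lines = {{0,1,2},{3,4,5},{6,7,8},{0,3,6},
                         {1,4,7},{2,5,8},{0,4,8},{2,4,6}};
        for (int[] l : lines)
            if (b[l[0]] != 0 && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
                return b[l[0]];
        return 0;
    }

    static boolean full(int[] b) {
        for (int v : b) if (v == 0) return false;
        return true;
    }

    // Returns the game value under optimal play: +1 X wins, -1 O wins, 0 draw.
    static int minimax(int[] b, int player) {
        int w = winner(b);
        if (w != 0) return w;
        if (full(b)) return 0;
        int best = (player == 1) ? -2 : 2;
        for (int i = 0; i < 9; i++) {
            if (b[i] != 0) continue;
            b[i] = player;                 // try the move
            int v = minimax(b, -player);   // opponent replies optimally
            b[i] = 0;                      // undo
            best = (player == 1) ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }

    public static void main(String[] args) {
        // From the empty board, optimal tic-tac-toe play is a draw.
        System.out.println(minimax(new int[9], 1)); // prints 0
    }
}
```

The same exhaustive search is infeasible for the workshop games above (Four in a Row, Gomoku, Breakthrough, Backgammon, Mancala), which is where depth cutoffs, evaluation functions, and the learning methods of Weeks 2-4 come in.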
Bibliography [for background]
- Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
- Bertsekas, D. P. and Tsitsiklis, J. N. (1996). Neuro-Dynamic Programming. Athena Scientific, Belmont, MA.
- Gardner (1981). Samuel's checkers player. In Barr, A. and Feigenbaum, E. A., editors, The Handbook of Artificial Intelligence, I, pages 84--108. William Kaufmann, Los Altos, CA.
- Samuel, A. L. (1967). Some studies in machine learning using the game of checkers. II---Recent progress. IBM Journal of Research and Development, pages 601--617.
- Tesauro, G. J. (1994). TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6(2):215--219.
- Tesauro, G. J. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38:58--68.
- Tsitsiklis, J. N. and Van Roy, B. (1996). Feature-based methods for large scale dynamic programming. Machine Learning, 22:59--94.
Previous Workshops: 1 2