Textbooks:
Markov Decision Processes, Martin L. Puterman
Reinforcement Learning, Richard S. Sutton and Andrew G. Barto
First Class:
Handouts:
1. Course description (postscript, html)
2. Template for scribe notes (postscript, latex, html) and explanation about latex (postscript, latex, html)
3. Slides of first class (power point, postscript, html)
Second Class:
Finished the overview.
Third and Fourth Class:
Model of Markov Decision Processes (MDP) and Finite Horizon Problems.
Lecture 3 (postscript, latex, html)
Lecture 4 (postscript, latex, html)
Homework 1 (postscript, latex, html)
Fifth and Sixth Class:
Infinite Horizon Discounted Problems.
Lecture 5 (postscript, latex, html)
Lecture 6 (postscript, latex, html)
Homework 2 (postscript, latex, html)
Seventh, Eighth and Ninth Class:
Learning with unknown model.
Lecture 7 (postscript, latex, html) Monte Carlo Algorithms
Lecture 8 (postscript, latex, html) Temporal Difference (TD) Algorithms
Lecture 9 (postscript, latex, html) Q-Learning (and SARSA) Algorithms
Homework 3 (postscript, latex, html)
Lectures Ten and Eleven:
Learning with large state space.
Lecture 10 (postscript, latex, html) TD-Gammon
Lecture 11 (postscript, latex, html) Large state space
Lecture Twelve:
Partially Observable MDP.
Lecture Thirteen:
Generator model and sparse sampling in Large MDPs.
PROJECT (postscript, latex, html)