Introduction
In this project I studied the article "Finite-Sample Convergence
Rates for Q-learning and Indirect Algorithms" by Michael Kearns
and Satinder Singh. The article discusses how much experience two
learning algorithms need in order to achieve a policy with a given
level of performance guarantee: Phased-Q-Learning (a variant of
the familiar Q-Learning) and the indirect algorithm (both
algorithms are explained later on). It also compares the behaviour
of the direct (Phased-Q-Learning) and indirect algorithms. The
article presents a theorem but does not prove it, so the heart of
my project is a proof of that theorem.
Yishay Mansour
2000-05-30