
   
Q-learning and SARSA algorithms

In this section we discuss off-line and on-line algorithms for computing
the optimal policy when the exact model is not known.
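
As a rough illustration of the two algorithms named in the heading, the sketch below shows the standard tabular update rules for Q-learning and SARSA. It is not taken from these notes: the function names, the dictionary-based table `Q`, the step size `alpha` and the discount factor `gamma` are all assumptions chosen for the example.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (hypothetical helper).

    The target bootstraps from the best action value in s_next,
    regardless of which action the agent takes there.
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One tabular SARSA step (hypothetical helper).

    The target bootstraps from the value of the action a_next
    that the agent actually takes in s_next.
    """
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Usage: the table defaults to 0.0 for unseen (state, action) pairs.
Q = defaultdict(float)
q_learning_update(Q, s=0, a="left", r=1.0, s_next=1, actions=["left", "right"])
```

Both updates use only the observed transition (state, action, reward, next state), so neither needs the transition probabilities or the reward function of the model. They differ only in how the value of the next state is estimated.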

 

Yishay Mansour
2000-01-07