Introduction: Discounted Infinite Horizon
Notation
Transition Probability Matrix
Expectation of Reward
The discounted value of a policy
Assumptions
Calculating the Return Value of a Given Policy
Existence of a unique solution
Example:
Properties of the transition matrix:
Computing the Optimal Policy
Optimality Equations
The Solution of the Optimality Equations:
Existence of a limit
is a fixed point
Uniqueness of
Example:
Yishay Mansour
1999-11-24