An Algorithm for Constructing an Optimal Policy
In this section we develop an algorithm that constructs an optimal deterministic Markovian policy. As shown in the previous section,
such a policy is optimal among all policies, not only among the deterministic Markovian ones. The construction proceeds
recursively, backward in time, from t=n down to t=1.
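The algorithm:
- 1. For every state $s \in S$, set
  $u_n(s) = r_n(s)$.
- 2. For $t = n-1, \ldots, 1$ and every state $s \in S$, compute
  $u_t(s) = \max_{a \in A} \left\{ r_t(s,a) + \sum_{j \in S} p_t(j \mid s,a)\, u_{t+1}(j) \right\}$,
  and the set of optimal actions is:
  $A^*_{s,t} = \arg\max_{a \in A} \left\{ r_t(s,a) + \sum_{j \in S} p_t(j \mid s,a)\, u_{t+1}(j) \right\}$.
The generated policy, $\pi^* = (d_1, \ldots, d_{n-1})$,
satisfying $d_t(s) \in A^*_{s,t}$ for every $t$ and every $s \in S$,
is an optimal policy.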
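
The backward recursion can be sketched in Python as follows. This is a minimal illustration rather than the notes' own implementation: it assumes the usual finite-horizon setup with a terminal reward at t=n, a per-step reward for each (t, state, action), and transition probabilities given as a dictionary from next state to probability. The function name, its parameters, and the two-state example are all hypothetical.

```python
def backward_induction(n, states, actions, reward, terminal_reward,
                       transition, tol=1e-12):
    """Finite-horizon backward induction (illustrative sketch).

    reward(t, s, a)      -> immediate reward at time t (assumed interface)
    terminal_reward(s)   -> reward collected at the final time n
    transition(t, s, a)  -> dict {next_state: probability}
    Returns (u, best): u[t][s] is the optimal value from time t onward,
    best[t][s] is the set of maximizing actions at (t, s).
    """
    # Step 1: values at the horizon are the terminal rewards.
    u = {n: {s: terminal_reward(s) for s in states}}
    best = {}
    # Step 2: sweep backward from t = n-1 down to t = 1.
    for t in range(n - 1, 0, -1):
        u[t], best[t] = {}, {}
        for s in states:
            # Expected return of each action: reward now plus value later.
            q = {
                a: reward(t, s, a)
                + sum(p * u[t + 1][j] for j, p in transition(t, s, a).items())
                for a in actions
            }
            u[t][s] = max(q.values())
            # Keep every action that attains the maximum (up to tol).
            best[t][s] = {a for a in actions if q[a] >= u[t][s] - tol}
    return u, best


# Hypothetical two-state example: being in state 1 pays 1 per step.
states = [0, 1]
actions = ["stay", "move"]

def reward(t, s, a):
    return 1.0 if s == 1 else 0.0

def terminal_reward(s):
    return 1.0 if s == 1 else 0.0

def transition(t, s, a):
    # Deterministic dynamics: "stay" keeps the state, "move" flips it.
    return {s: 1.0} if a == "stay" else {1 - s: 1.0}

u, best = backward_induction(3, states, actions, reward,
                             terminal_reward, transition)

# Any selection from the maximizing sets gives an optimal policy;
# sorting makes the choice deterministic.
policy = {t: {s: sorted(best[t][s])[0] for s in states} for t in best}
```

Returning the full set of maximizing actions, rather than a single action, mirrors the statement that any policy choosing from those sets at every time and state is optimal.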
Yishay Mansour
1999-11-18