An Algorithm for Constructing an Optimal Policy
In this section we develop an algorithm that constructs an optimal deterministic Markovian policy. As shown in the previous section,
such a policy is also optimal among all policies. The construction proceeds backward, from t=n down to
t=1, in a recursive manner.
The algorithm:
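- 1.
- For $t = n$ and every state $s_n \in S$, set
  $$U_n(s_n) = r_n(s_n).$$
- 2.
- For $t = n-1, \ldots, 1$ and every state $s_t \in S$, compute
  $$U_t(s_t) = \max_{a \in A} \Big\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a)\, U_{t+1}(j) \Big\}$$
  and the optimal group of actions is:
  $$A^*_{s_t,t} = \arg\max_{a \in A} \Big\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a)\, U_{t+1}(j) \Big\}.$$
The generated policy, $\pi^* = (d_1, \ldots, d_{n-1})$, satisfying $d_t(s_t) \in A^*_{s_t,t}$ for every stage $t$ and state $s_t$, is an optimal policy.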
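
The backward recursion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the notes' own implementation: the function name `backward_induction`, its arguments (`reward`, `terminal_reward`, `transition`), and the two-state toy problem are all hypothetical, and the sketch assumes finite state and action sets with the same actions available in every state.

```python
def backward_induction(n, states, actions, reward, terminal_reward, transition):
    """Backward induction for a finite-horizon MDP (illustrative sketch).

    Assumed (hypothetical) interfaces:
      reward(t, s, a)      -> immediate reward at stage t
      terminal_reward(s)   -> reward collected at the final stage n
      transition(t, s, a)  -> dict mapping next state j to its probability
    """
    # Step 1: at the final stage the value is just the terminal reward.
    U = {n: {s: terminal_reward(s) for s in states}}
    optimal_actions = {}

    # Step 2: sweep backward from stage n-1 down to stage 1.
    for t in range(n - 1, 0, -1):
        U[t] = {}
        optimal_actions[t] = {}
        for s in states:
            # Value of each action: immediate reward plus expected future value.
            q = {
                a: reward(t, s, a)
                + sum(p * U[t + 1][j] for j, p in transition(t, s, a).items())
                for a in actions
            }
            best = max(q.values())
            U[t][s] = best
            # All maximizing actions (exact comparison; ties that differ only
            # by floating-point rounding would need a tolerance).
            optimal_actions[t][s] = [a for a in actions if q[a] == best]

    # Any selection from the optimal action sets gives a deterministic
    # Markovian policy; here we simply take the first maximizer.
    policy = {
        t: {s: acts[0] for s, acts in by_state.items()}
        for t, by_state in optimal_actions.items()
    }
    return U, optimal_actions, policy


# Hypothetical toy problem: two states, two actions, horizon n = 3.
states = ["low", "high"]
actions = ["stay", "move"]

def reward(t, s, a):
    # Staying in the "high" state pays 1; everything else pays 0.
    return 1 if (s == "high" and a == "stay") else 0

def terminal_reward(s):
    return 2 if s == "high" else 0

def transition(t, s, a):
    # Deterministic dynamics: "stay" keeps the state, "move" switches it.
    other = "high" if s == "low" else "low"
    return {s: 1.0} if a == "stay" else {other: 1.0}

U, optimal_actions, policy = backward_induction(
    3, states, actions, reward, terminal_reward, transition
)
print(U[1], policy[1])
```

In this toy problem the policy chosen at both decision stages is to stay in "high" and to move out of "low". Picking the first maximizer is an arbitrary tie-breaking rule; any other element of the optimal action set would give an equally good policy.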
Yishay Mansour
1999-11-18