
   
Summary

Theorems 3.2 and 4.1 together imply

\begin{displaymath}V_N^*(s) = \max_{\pi \in {\Pi}^{HR}} \{V_N^{\pi}(s)\} = \max_{\pi \in {\Pi}^{MD}} \{V_N^{\pi}(s)\}\end{displaymath}

That is, an optimal policy can always be chosen from the class of Markovian deterministic policies: neither history dependence nor randomization increases the optimal N-step return.
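
As a small numerical illustration, and not a proof, the sketch below checks on a toy example that the best value over all Markovian deterministic policies coincides with the value computed by backward induction. The two-state, two-action MDP, the horizon N = 3, the zero terminal reward, and all function names are assumptions made for this example; none of them come from the notes.

```python
from itertools import product

# Hypothetical toy MDP (not from the notes): 2 states, 2 actions, horizon N = 3.
STATES = [0, 1]
ACTIONS = [0, 1]
N = 3
# P[s][a] lists next-state probabilities; R[s][a] is the immediate reward.
P = {0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
     1: {0: [0.5, 0.5], 1: [0.0, 1.0]}}
R = {0: {0: 1.0, 1: 0.0},
     1: {0: 2.0, 1: 3.0}}


def q_value(s, a, v_next):
    """Immediate reward plus expected value of the next state."""
    return R[s][a] + sum(p * v_next[t] for t, p in zip(STATES, P[s][a]))


def backward_induction():
    """Optimal N-step values, assuming zero terminal reward."""
    v = [0.0 for _ in STATES]
    for _ in range(N):
        v = [max(q_value(s, a, v) for a in ACTIONS) for s in STATES]
    return v


def evaluate(policy):
    """N-step values of a Markovian deterministic policy.

    policy[t][s] is the action taken at step t in state s.
    """
    v = [0.0 for _ in STATES]
    for t in reversed(range(N)):
        v = [q_value(s, policy[t][s], v) for s in STATES]
    return v


def best_markov_deterministic():
    """Per-state maximum over every Markovian deterministic policy."""
    rules = list(product(ACTIONS, repeat=len(STATES)))  # one decision rule
    best = [float("-inf")] * len(STATES)
    for policy in product(rules, repeat=N):  # one rule per time step
        v = evaluate(policy)
        best = [max(b, x) for b, x in zip(best, v)]
    return best


v_star = backward_induction()
v_md = best_markov_deterministic()
print(v_star, v_md)
```

The enumeration visits |A|^(|S| N) = 64 policies here, so it is only practical for very small instances; it is meant to make the right-hand maximum in the displayed equation concrete, not to serve as an algorithm.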



Yishay Mansour
1999-11-18