
   
Finite Horizon

In lecture number 3 we introduced the optimality equations:

\begin{displaymath}U_t(h_t) = \max_{a\in A} \{r_t(s_t,a) + \sum_{j\in S} P_t(j\vert s_t,a)U_{t+1}(h_t,a,j)\},\end{displaymath}

where $U_N(h_N) = r_N(s_N)$ for $h_N = (h_{N-1}, a_{N-1}, s_N)$.
We showed:
1. An optimal policy
2. A deterministic optimal policy
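
The optimality equations above can be evaluated directly by recursing on the history. The following is a minimal sketch, not part of the original lecture: the two-state, two-action model, the horizon N = 3, and all names (`reward`, `final_reward`, `transition`, `optimal_value`) are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence, Tuple

History = Tuple[int, ...]  # (s_1, a_1, s_2, ..., a_{t-1}, s_t)

# Hypothetical toy model, assumed for illustration only.
STATES: Sequence[int] = (0, 1)
ACTIONS: Sequence[int] = (0, 1)
N = 3  # horizon; decisions are made at t = 1, ..., N-1


def reward(t: int, s: int, a: int) -> float:
    """Toy r_t(s, a): one unit of reward for being in state 1."""
    return 1.0 if s == 1 else 0.0


def final_reward(s: int) -> float:
    """Toy terminal reward r_N(s)."""
    return 1.0 if s == 1 else 0.0


def transition(t: int, j: int, s: int, a: int) -> float:
    """Toy P_t(j | s, a): action 0 stays put, action 1 moves uniformly."""
    if a == 0:
        return 1.0 if j == s else 0.0
    return 0.5


def optimal_value(h: History, t: int) -> Tuple[float, Optional[int]]:
    """Return (U_t(h_t), a maximizing action); (r_N(s_N), None) at t = N."""
    s = h[-1]  # the current state s_t is the last entry of the history
    if t == N:
        return final_reward(s), None
    best_value, best_action = float("-inf"), None
    for a in ACTIONS:
        # r_t(s_t, a) + sum_j P_t(j | s_t, a) * U_{t+1}(h_t, a, j)
        q = reward(t, s, a)
        for j in STATES:
            p = transition(t, j, s, a)
            if p > 0.0:
                q += p * optimal_value(h + (a, j), t + 1)[0]
        if q > best_value:  # strict comparison: ties go to the first maximizer
            best_value, best_action = q, a
    return best_value, best_action


print(optimal_value((0,), 1))  # (1.25, 1)
print(optimal_value((1,), 1))  # (3.0, 0)
```

Selecting a single maximizing action at every history, as the sketch does, gives a deterministic decision rule. The recursion enumerates histories explicitly, so its cost grows exponentially in N; it is meant to mirror the equation term by term, not to be efficient.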


 

Yishay Mansour
1999-11-18