Example: Running the Value Iteration Algorithm
Consider the MDP in the figure.
Let:
Step 1: We initialize:
Steps 2-5:
Step 6:
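Each of the steps above applies the standard value-iteration (Bellman) backup defined in the preceding sections; writing $\gamma$ for the discount factor, the update performed at every iteration $t$ is:

\[
V_{t+1}(s) \;=\; \max_{a} \left[ R(s,a) \;+\; \gamma \sum_{s'} P(s' \mid s, a)\, V_t(s') \right]
\]

Step 1 fixes the initial values $V_0$, and each subsequent step computes $V_{t+1}$ from $V_t$ by this rule.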
Note that the iterated value approaches the optimal value $V^*$, as guaranteed by the convergence result of the previous section.
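The steps above can be sketched in code. The two-state MDP below (its transition probabilities, rewards, and discount factor) is purely illustrative and is not the MDP of the figure, whose numbers are not reproduced here; it only demonstrates the iteration itself.

```python
# Hypothetical MDP: P[s][a] is a list of (probability, next_state) pairs,
# R[s][a] is the immediate reward for action a in state s.
P = {
    0: {0: [(1.0, 0)], 1: [(0.8, 1), (0.2, 0)]},
    1: {0: [(1.0, 1)], 1: [(0.7, 0), (0.3, 1)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.5, 1: 2.0}}
gamma = 0.9  # discount factor (assumed)

def value_iteration(P, R, gamma, tol=1e-6):
    """Iterate V_{t+1}(s) = max_a [R(s,a) + gamma * sum_s' P(s'|s,a) V_t(s')]."""
    V = {s: 0.0 for s in P}              # Step 1: initialize V_0 = 0
    while True:
        V_new = {}
        for s in P:                       # one Bellman backup per state
            V_new[s] = max(
                R[s][a] + gamma * sum(p * V[sp] for p, sp in P[s][a])
                for a in P[s]
            )
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new                  # iterates have numerically converged
        V = V_new

V_star = value_iteration(P, R, gamma)
```

As in the example, the successive iterates approach the fixed point of the Bellman update, i.e. the optimal value function for this toy MDP.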
Yishay Mansour
1999-12-18