Expectation of Reward
The expected value function after $t$ steps, starting from state $s$ and using policy $\pi$, is:
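As a minimal sketch of how such an expectation can be computed, assume the policy is fixed and stationary, so that it induces a transition matrix over states and an expected one-step reward per state. The names `P`, `r`, `expected_rewards`, and `expected_total_reward` are illustrative assumptions, not notation taken from these notes, and whether the value "after t steps" means the reward at step t or the sum over the first t steps is left open here, so both are shown.

```python
def expected_rewards(P, r, s, t):
    """Expected reward at each of the first t steps, starting from state s.

    Assumptions (hypothetical names, not from the notes):
      P[i][j] -- probability of moving from state i to state j under the policy
      r[i]    -- expected one-step reward collected in state i under the policy
    """
    n = len(P)
    # Start with all probability mass on the initial state s.
    dist = [0.0] * n
    dist[s] = 1.0
    per_step = []
    for _ in range(t):
        # Expected reward under the current state distribution.
        per_step.append(sum(dist[i] * r[i] for i in range(n)))
        # Advance the distribution one step: dist <- dist * P.
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return per_step


def expected_total_reward(P, r, s, t):
    """Sum of the expected rewards over the first t steps, starting from s."""
    return sum(expected_rewards(P, r, s, t))
```

For example, with `P = [[0.5, 0.5], [0.0, 1.0]]` and `r = [1.0, 0.0]`, starting from state 0 the expected rewards over three steps are 1.0, 0.5, and 0.25, for a total of 1.75. The sketch propagates the state distribution step by step rather than forming matrix powers, which keeps it dependency-free.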
Yishay Mansour
1999-11-24