Algorithms - Value Iteration
$V_0(s) := 0$
$V_{t+1}(s) := \max_{a \in A} \{ E[R(s,a)] + \gamma \sum_{s' \in S} \delta(s,a,s') V_t(s') \} \quad \forall s$
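The update above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `value_iteration`, the nested-list layout (`rewards[s][a]` for E[R(s,a)], `delta[s][a][s2]` for the transition probability), and the two-state MDP are all assumptions made for the example.

```python
def value_iteration(rewards, delta, gamma, num_iters):
    """Run num_iters synchronous value-iteration updates starting from V_0 = 0.

    rewards[s][a]   : expected reward E[R(s,a)]           (assumed layout)
    delta[s][a][s2] : probability of moving from s to s2  (assumed layout)
    """
    num_states = len(rewards)
    values = [0.0] * num_states  # V_0(s) := 0
    for _ in range(num_iters):
        # V_{t+1}(s) := max_a { E[R(s,a)] + gamma * sum_{s'} delta(s,a,s') V_t(s') }
        values = [
            max(
                rewards[s][a]
                + gamma * sum(delta[s][a][s2] * values[s2] for s2 in range(num_states))
                for a in range(len(rewards[s]))
            )
            for s in range(num_states)
        ]
    return values


# Hypothetical 2-state, 2-action deterministic MDP, used only for illustration.
rewards = [[0.0, 1.0], [2.0, 0.0]]
delta = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0 stays, action 1 moves to state 0
]
gamma = 0.5

print(value_iteration(rewards, delta, gamma, 1))
print(value_iteration(rewards, delta, gamma, 50))
```

For this toy MDP the optimal values are V*(0) = 3 and V*(1) = 4, so the second call should print values very close to those.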
Let $d_t = \max_{s \in S} \{ |V_t(s) - V^*(s)| \}$
Claim: $d_{t+1} \le \gamma \, d_t$
Convergence: After t iterations the error is at most $\gamma^t d_0$.
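The geometric decay of the error can be checked numerically. The sketch below tracks the max-norm error over the iterations and tests that it shrinks by at least a factor of gamma per step, i.e. that the error after step t+1 is at most gamma times the error after step t. The helper names `bellman_backup` and `error_trace`, the toy MDP, and the use of a long run of backups as a stand-in for V* are assumptions made for illustration.

```python
def bellman_backup(values, rewards, delta, gamma):
    """One synchronous update: max over actions of reward plus discounted next value."""
    n = len(values)
    return [
        max(
            rewards[s][a] + gamma * sum(delta[s][a][s2] * values[s2] for s2 in range(n))
            for a in range(len(rewards[s]))
        )
        for s in range(n)
    ]


def error_trace(rewards, delta, gamma, num_iters, reference_iters=500):
    """Return [d_0, ..., d_num_iters], the max-norm error at each iteration.

    V* is approximated by running reference_iters backups (an assumption;
    for gamma well below 1 this is accurate to floating-point precision).
    """
    v_star = [0.0] * len(rewards)
    for _ in range(reference_iters):
        v_star = bellman_backup(v_star, rewards, delta, gamma)

    values = [0.0] * len(rewards)  # V_0(s) := 0
    errors = []
    for _ in range(num_iters + 1):
        errors.append(max(abs(v - vs) for v, vs in zip(values, v_star)))
        values = bellman_backup(values, rewards, delta, gamma)
    return errors


# Hypothetical 2-state, 2-action deterministic MDP, used only for illustration.
rewards = [[0.0, 1.0], [2.0, 0.0]]
delta = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
gamma = 0.5

errors = error_trace(rewards, delta, gamma, 10)
for t in range(10):
    # Per-step contraction and the resulting geometric bound gamma^t * d_0.
    assert errors[t + 1] <= gamma * errors[t] + 1e-12
    assert errors[t] <= gamma ** t * errors[0] + 1e-12
print(errors[:4])
```

In this particular example the bound happens to be tight: the error halves exactly at every iteration. In general the contraction only guarantees an upper bound.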