Existence of a unique solution
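We define a linear transformation $L_d : V \to V$ by $L_d v = r_d + \lambda P_d v$.
Since $v_\lambda^{d^\infty} = r_d + \lambda P_d v_\lambda^{d^\infty}$, the value $v_\lambda^{d^\infty}$ is a fixed point of $L_d$.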
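The fixed-point property can be checked numerically. The sketch below assumes a hypothetical two-state chain (the matrix `P_d`, reward vector `r_d`, and discount `lam` are made-up values, not taken from the lecture) and approximates the discounted return by a truncated sum $\sum_{t=0}^{T-1} \lambda^t P_d^t r_d$. Applying $L_d$ to that sum should leave it unchanged up to rounding.

```python
import numpy as np

# Hypothetical 2-state example; all numbers are assumptions for illustration.
P_d = np.array([[0.9, 0.1],
                [0.2, 0.8]])   # row-stochastic transition matrix under d
r_d = np.array([1.0, 0.0])     # expected immediate reward under d
lam = 0.5                      # discount factor, 0 <= lam < 1


def L_d(v):
    """Apply the linear transformation L_d v = r_d + lam * P_d v."""
    return r_d + lam * (P_d @ v)


def truncated_return(horizon):
    """Return sum_{t=0}^{horizon-1} lam^t P_d^t r_d, built term by term."""
    v = np.zeros_like(r_d)
    term = r_d.copy()          # holds lam^t P_d^t r_d for the current t
    for _ in range(horizon):
        v += term
        term = lam * (P_d @ term)
    return v


v = truncated_return(200)
# Largest componentwise change when L_d is applied to v.
residual = np.max(np.abs(L_d(v) - v))
print(v, residual)
```

With a long enough horizon, the residual is at the level of floating-point rounding, which is what the fixed-point statement predicts.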
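Theorem 5.2
For $0 \le \lambda < 1$ and $d^\infty$ a Markovian stationary policy, $v_\lambda^{d^\infty}$ is the unique solution of the equation set $v = r_d + \lambda P_d v$ and is equal to $(I - \lambda P_d)^{-1} r_d$.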
Proof:
We can write the equation set as $(I - \lambda P_d) v = r_d$.
Since $P_d$ is a probability matrix, $\|P_d\| = 1$, and as $\lambda < 1$, $\|\lambda P_d\| < 1$.
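According to Theorem , $(I - \lambda P_d)^{-1}$ exists. Thus, a solution $v = (I - \lambda P_d)^{-1} r_d$ exists, and it is unique because $I - \lambda P_d$ is invertible.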
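By the same theorem, $(I - \lambda P_d)^{-1} r_d = \sum_{t=0}^{\infty} \lambda^t P_d^t r_d = v_\lambda^{d^\infty}$.
We have shown that the solution is the discounted return value of policy $d^\infty$.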
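The steps of the proof can be traced on a small example. This is a minimal sketch assuming the same kind of hypothetical two-state chain (the values of `P_d`, `r_d`, and `lam` are illustrative assumptions). It checks that the max-row-sum norm of $\lambda P_d$ is below 1, solves $(I - \lambda P_d) v = r_d$ directly, and compares the result with a truncated version of the series $\sum_t \lambda^t P_d^t r_d$.

```python
import numpy as np

# Hypothetical 2-state example; all numbers are assumptions for illustration.
P_d = np.array([[0.9, 0.1],
                [0.2, 0.8]])   # row-stochastic transition matrix under d
r_d = np.array([1.0, 0.0])     # expected immediate reward under d
lam = 0.5                      # discount factor, 0 <= lam < 1
I = np.eye(P_d.shape[0])

# Max-row-sum (infinity) norm of lam * P_d; equals lam for a stochastic matrix.
norm = np.linalg.norm(lam * P_d, ord=np.inf)

# Direct solution of the linear system (I - lam * P_d) v = r_d.
v_solve = np.linalg.solve(I - lam * P_d, r_d)

# Truncated series sum_{t=0}^{199} (lam * P_d)^t r_d.
v_series = sum(np.linalg.matrix_power(lam * P_d, t) @ r_d for t in range(200))

print(norm)
print(v_solve)
print(v_series)
```

Solving the linear system is usually preferable to summing the series in practice; the series is included here only to mirror the argument in the proof.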
Figure: Example Diagram
Yishay Mansour
1999-11-24