Homework number 2
1.
Symmetric game
A symmetric
two player game has u1(i,j)= u2(j,i).
Show that every symmetric game has a symmetric mixed Nash equilibrium. (In a
symmetric equilibrium both players have the same mixed action.)
2.
External Regret: Show that a
player playing a regret minimization algorithm guarantees to himself an average
payoff of at least the minimax value minus ε
where ε>0 can be arbitrarily small (as a function of T). Namely, the
value it can guarantee in a zero-sum game against an adversary, with respect to
its utility function. [HINT: show that an action sequence has an average payoff
less than the minimax value, then
it has an external regret.]
3.
Regret:
Show an example where the swap regret
is unbounded with respect to the external regret. (There is an example that
uses only 3 actions, has zero external regret and swap regret linear in T.)
Clarification: You need to choose only a loss sequence and actions
(no adversary or algorithm). Given the loss sequence and the selected actions,
the external regret is the difference between the loss of the action sequence
and that of the loss of the best single action. The swap regret is the
difference between the loss of the action sequence and the loss of the best
function f of swapping actions.
4.
Polynomial weights : Consider the following weight update: w(t+1,i)=w(t,i)(1-η loss(t,i)),
where loss(t,i) is the loss of action i at time t, and initially w(1,i)=1.(
Assume that the losses are in [0,1])
Show a regret
bound of ηQ + ln(m)/η,
where Q bound the sum of square of losses of any action, i.e.,
Q > maxi
∑t loss2(t,i), and η<½.
Here are the
proposed steps. Let Wt be
the sum of the weights at time t. Fix an action i,
and let L(T,i)
be the loss of action i until time T.Show:
·
ln(WT+1
/W1) ≥ - ηL(T,i) – ln(m)- η2Q
o
use the
inequality ln(1-z) ≥ -z-z2
for z in [0,½].
·
ln(WT+1/W1)≤
-ηLON, where LON is the online loss
o use the inequality ln(1+z) ≤ z
The
homework is due May 17
Hand in
your proposal for a project by May 24