Slide 26
INTUITION
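• Introduce 3 matrices:
$$\Phi = (\phi(x_1), \ldots, \phi(x_n))^T,$$
$$\tilde{\Phi} = (\phi(x_2), \ldots, \phi(x_{n+1}))^T,$$
$$\bar{\Phi} = \left(E_{x' \sim p(\cdot \mid x_1, \pi(x_1))}[\phi(x')], \ldots, E_{x' \sim p(\cdot \mid x_n, \pi(x_n))}[\phi(x')]\right)^T.$$
• In this notation, LSTD is:
$$w = \left(\Phi^T(\Phi - \gamma \tilde{\Phi})\right)^{-1} \Phi^T R.$$
• Bellman equation is:
$$R = (\Phi - \gamma \bar{\Phi})\, w.$$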
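
The two formulas differ only in which next-state matrix they use: the Bellman equation uses the expected next-state features, while LSTD substitutes the sampled next-state features and left-multiplies by the transpose of the feature matrix. As a minimal sketch of this, the NumPy example below runs the LSTD formula on a toy problem. The 3-state deterministic cycle, the rewards, the one-hot features, and the names `lstd`, `phi`, `phi_next`, and `state_rewards` are all illustrative assumptions, not taken from the slide.

```python
import numpy as np


def lstd(phi, phi_next, rewards, gamma):
    """Solve w = (Phi^T (Phi - gamma * Phi_tilde))^{-1} Phi^T R.

    phi      : (n, d) matrix whose rows are phi(x_1), ..., phi(x_n)
    phi_next : (n, d) matrix whose rows are phi(x_2), ..., phi(x_{n+1})
    rewards  : (n,) vector of rewards observed along the trajectory
    """
    a = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(a, b)


# Hypothetical toy problem: deterministic cycle 0 -> 1 -> 2 -> 0.
gamma = 0.9
state_rewards = np.array([1.0, 0.0, 2.0])      # reward received in each state
trajectory = np.array([0, 1, 2, 0, 1, 2, 0])   # x_1, ..., x_{n+1} with n = 6

features = np.eye(3)                  # one-hot phi(x), chosen for illustration
phi = features[trajectory[:-1]]       # Phi
phi_next = features[trajectory[1:]]   # Phi tilde
rewards = state_rewards[trajectory[:-1]]

w = lstd(phi, phi_next, rewards, gamma)

# Transitions here are deterministic, so the sampled next-state features equal
# their expectation and the Bellman equation R = (Phi - gamma * Phi_bar) w
# should hold up to floating-point error.
bellman_residual = rewards - (phi - gamma * phi_next) @ w
```

With stochastic transitions the sampled and expected next-state matrices no longer coincide, and the residual would only be small on average rather than exactly zero.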