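$$
\begin{aligned}
&\pi(a \mid S_{t+1})\, Q_{h-1}(S_{t+1}, a) + \gamma\, \pi(A_{t+1} \mid S_{t+1})\, G_{t+1:h} \\
&= R_{t+1} + \gamma V_{h-1}(S_{t+1}) - \gamma\, \pi(A_{t+1} \mid S_{t+1})\, Q_{h-1}(S_{t+1}, A_{t+1}) + \gamma\, \pi(A_{t+1} \mid S_{t+1})\, G_{t+1:h} \\
&= R_{t+1} + \gamma\, \pi(A_{t+1} \mid S_{t+1}) \bigl(G_{t+1:h} - Q_{h-1}(S_{t+1}, A_{t+1})\bigr) + \gamma V_{h-1}(S_{t+1})
\end{aligned}
$$

n-step Tree Backup: Control Variate Sarsa with ρ replaced by π(A|S).

$$
G_{t:h} \doteq R_{t+1} + \gamma \bigl(\sigma_{t+1}\rho_{t+1} + (1 - \sigma_{t+1})\, \pi(A_{t+1} \mid S_{t+1})\bigr) \bigl(G_{t+1:h} - Q_{h-1}(S_{t+1}, A_{t+1})\bigr) + \gamma V_{h-1}(S_{t+1})
$$

n-step Q(σ): σ switches between Control Variate Sarsa and n-step Tree Backup.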
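The recursion for the n-step Q(σ) return can be sketched in Python as below. This is a minimal illustration under several assumptions that are not in the slides: the function names `q_sigma_return` and `tree_backup_return`, the table layout (`Q[s][a]`, `pi[s][a]`, `b[s][a]` as nested lists, with `b` the behaviour policy), the base case G_{h:h} = Q(S_h, A_h), and the absence of terminal states inside the segment. V(s) is taken to be the expected value Σ_a π(a|s) Q(s, a).

```python
def q_sigma_return(rewards, states, actions, Q, pi, b, sigma, gamma):
    """Sketch of the n-step Q(sigma) return G_{t:h} (hypothetical helper).

    rewards[k] = R_{t+k+1}, states[k] = S_{t+k+1}, actions[k] = A_{t+k+1},
    sigma[k] = sigma_{t+k+1}, for k = 0..n-1 with n = h - t.
    """
    # Assumed base case: G_{h:h} = Q(S_h, A_h); terminal states are not handled.
    G = Q[states[-1]][actions[-1]]
    for k in reversed(range(len(rewards))):
        s, a = states[k], actions[k]
        # Expected value under the target policy: V(s) = sum_a pi(a|s) Q(s, a)
        v = sum(p * q for p, q in zip(pi[s], Q[s]))
        rho = pi[s][a] / b[s][a]  # importance sampling ratio
        # sigma = 1 gives the Control Variate Sarsa weight, sigma = 0 the Tree Backup weight
        coef = sigma[k] * rho + (1.0 - sigma[k]) * pi[s][a]
        G = rewards[k] + gamma * coef * (G - Q[s][a]) + gamma * v
    return G


def tree_backup_return(rewards, states, actions, Q, pi, gamma):
    """n-step Tree Backup return in its sum-over-untaken-actions form."""
    G = Q[states[-1]][actions[-1]]
    for k in reversed(range(len(rewards))):
        s, a = states[k], actions[k]
        # Actions not taken contribute their current estimates
        others = sum(pi[s][x] * Q[s][x] for x in range(len(Q[s])) if x != a)
        G = rewards[k] + gamma * others + gamma * pi[s][a] * G
    return G
```

With every `sigma[k]` set to 0, `q_sigma_return` should agree with `tree_backup_return` up to floating-point error, which mirrors the algebraic rewrite of the Tree Backup return into control-variate form.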