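Value with NN

$Q(s, a) \approx Q_\phi(s, a)$

Objective function:

$$\mathbb{E}_{(s, a, r, s') \sim \mathcal{B}} \left[ \left( r + \gamma \max_{a'} Q_\phi(s', a') - Q_\phi(s, a) \right)^2 \right]$$

DDPG

Approximate the value in the case of continuous actions. If $a$ is continuous, the maximization $\max_{a'} Q_\phi(s', a')$ is difficult, so the actor's action $\mu_\theta(s')$ takes its place.

Actor: $\mu_\theta(s)$
Critic: $Q_\phi(s, a)$

Objective functions:

$$\mathbb{E}_{(s, a, r, s') \sim \mathcal{B}} \left[ \left( r + \gamma Q_\phi(s', \mu_\theta(s')) - Q_\phi(s, a) \right)^2 \right] \quad \text{for } \phi$$

$$\mathbb{E}_{s \sim \mathcal{B}} \left[ -Q_\phi(s, \mu_\theta(s)) \right] \quad \text{for } \theta$$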
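The objectives above can be sketched in plain Python on a toy batch. This is only an illustration under stated assumptions: the linear critic `q_value`, the linear actor `policy`, the tuple layout `(s, a, r, s_next)` for samples from the buffer, and the default `gamma` are all hypothetical choices, not part of the slides. It computes loss values only; a real implementation would minimize them with gradient descent over the parameters phi and theta.

```python
from statistics import fmean


def q_value(phi, s, a):
    """Hypothetical critic Q_phi(s, a): linear in s, a, and s*a (an assumption)."""
    w_s, w_a, w_sa = phi
    return w_s * s + w_a * a + w_sa * s * a


def policy(theta, s):
    """Hypothetical deterministic actor mu_theta(s): linear in the state."""
    return theta * s


def max_q_loss(phi, batch, actions, gamma=0.99):
    """Squared TD error with a max over a finite action set.

    The max is cheap here only because `actions` is a small finite list.
    """
    errors = []
    for s, a, r, s_next in batch:
        target = r + gamma * max(q_value(phi, s_next, a2) for a2 in actions)
        errors.append((target - q_value(phi, s, a)) ** 2)
    return fmean(errors)


def ddpg_critic_loss(phi, theta, batch, gamma=0.99):
    """Squared TD error where the actor's action replaces the max (loss for phi)."""
    errors = []
    for s, a, r, s_next in batch:
        target = r + gamma * q_value(phi, s_next, policy(theta, s_next))
        errors.append((target - q_value(phi, s, a)) ** 2)
    return fmean(errors)


def ddpg_actor_loss(phi, theta, batch):
    """Negative critic value of the actor's own action (loss for theta)."""
    return fmean([-q_value(phi, s, policy(theta, s)) for s, _, _, _ in batch])


if __name__ == "__main__":
    # One toy transition (s, a, r, s_next); all numbers are made up.
    batch = [(1.0, 0.0, 1.0, 2.0)]
    phi = (1.0, 1.0, 0.0)   # critic weights, so Q(s, a) = s + a
    theta = 2.0             # actor weight, so mu(s) = 2 * s
    print(max_q_loss(phi, batch, actions=[0.0, 1.0], gamma=0.5))
    print(ddpg_critic_loss(phi, theta, batch, gamma=0.5))
    print(ddpg_actor_loss(phi, theta, batch))
```

Note that the only difference between `max_q_loss` and `ddpg_critic_loss` is how the next action is chosen: an explicit max over a finite set versus a single call to the actor, which is what makes the continuous case tractable.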