In real-world applications such as recommendation systems, LLM-driven agents need to take actions that adapt to users' preferences. In this talk, we will briefly introduce the basic concepts of reinforcement learning (RL) and two widely used policy optimization algorithms, Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO). We will then demonstrate a recommendation agent to show how RL can enable agents to provide more user-adaptive responses.