Slide 16
Experiments
● MultiWOZ (Budzianowski et al., 2018)
○ 7 domains: train, restaurant, taxi, ...
○ A user goal may involve multiple domains
○ 8,438 dialogues
● Evaluation Metrics
○ Success rate
○ Inform F1
○ Match
○ #Turns
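
The four metrics above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the evaluator used in the experiments: the `DialogResult` record and its fields are hypothetical, Inform F1 is micro-averaged over requested slots, Match is reduced to a per-dialogue boolean, and Success is taken to mean "all requested slots informed and all bookings matched".

```python
from dataclasses import dataclass


@dataclass
class DialogResult:
    """Hypothetical per-dialogue record (not from the slides)."""
    requested: set   # slots the user goal asked for
    informed: set    # slots the system actually provided
    booked_ok: bool  # True if every booked entity satisfies the goal constraints
    turns: int       # number of dialogue turns


def inform_f1(results):
    # Micro-averaged F1 over requested vs. informed slots (an assumption).
    tp = sum(len(r.requested & r.informed) for r in results)
    fp = sum(len(r.informed - r.requested) for r in results)
    fn = sum(len(r.requested - r.informed) for r in results)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def match_rate(results):
    # Fraction of dialogues whose bookings satisfy the goal constraints.
    return sum(r.booked_ok for r in results) / len(results)


def success_rate(results):
    # Assumed definition: bookings match AND every requested slot was informed.
    hits = sum(r.booked_ok and r.requested <= r.informed for r in results)
    return hits / len(results)


def average_turns(results):
    # Mean dialogue length in turns.
    return sum(r.turns for r in results) / len(results)
```

Usage is a matter of building one `DialogResult` per simulated dialogue and passing the list to each function; fewer turns is better only when success is held comparable.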
● Baselines
○ PPO (Schulman et al., 2017)
○ DQN (Mnih et al., 2015)
○ DDQ (DQN + unconstrained diversification) (Peng et al., 2018)
○ GDPL (PPO + IRL, leading performer on MultiWOZ) (Takanobu et al., 2019)
○ MADPL (MARL, leading performer on MultiWOZ) (Takanobu et al., 2020)