Presentation of a paper by Jaques et al. (2018) at the Mila Learning Agents RG: https://arxiv.org/abs/1810.08647
An important feature of human decision-making is our ability to predict each other's behavior. Through social interaction, we learn a model of others' internal states, which helps us anticipate their future actions, plan, and collaborate. Recent deep learning models have been compared to idiot savants: capable of performing highly specialized tasks but lacking what social psychology calls a "theory of mind". In this research, Jaques et al. study the conditions under which a theory of mind can emerge in multi-agent RL, and discover an interesting connection to causal inference.
The authors began by exploring a novel reward structure based on "social influence", observing that a rudimentary form of communication emerged between agents. Then, by providing an explicit communication channel, they observed that agents could achieve better collective outcomes. Finally, using tools from causal inference, they endowed each agent with a Model of Other Agents (MOA) network, allowing it to predict others' actions without direct access to the counterpart's reward function. In doing so, the influence reward became a form of intrinsic motivation, and the researchers were able to remove the external reward mechanism altogether.
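To make the counterfactual idea concrete, here is a minimal sketch (not the authors' code) of how an influence reward could be computed from a learned MOA. The `moa` callable, its `(state, action)` signature, and the toy numbers below are assumptions for illustration only: the MOA is assumed to return the other agent's action distribution given the state and our own action.

import numpy as np

def influence_reward(moa, state, own_action, own_policy):
    """Counterfactual influence reward with respect to one other agent:
    KL( p(a_j | a_k, s) || p(a_j | s) ), where the marginal p(a_j | s)
    averages over the actions a_k we could have taken, weighted by our policy."""
    conditional = moa(state, own_action)            # p(a_j | our actual action, state)
    marginal = np.zeros_like(conditional)
    for alt_action, prob in enumerate(own_policy):  # counterfactual replacements of our action
        marginal += prob * moa(state, alt_action)
    eps = 1e-12                                     # numerical safety for log(0)
    return float(np.sum(conditional * (np.log(conditional + eps) - np.log(marginal + eps))))

# Toy example: the other agent tends to copy us, so our action is informative about theirs.
toy_moa = lambda state, a: np.array([0.9, 0.1]) if a == 0 else np.array([0.1, 0.9])
r = influence_reward(toy_moa, state=None, own_action=0, own_policy=np.array([0.5, 0.5]))
print(r)  # positive: our chosen action shifts the other agent away from its marginal behavior

Intuitively, the reward is high when the other agent's predicted behavior changes a lot depending on what we actually did, compared to what it would have done on average over our counterfactual actions.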
In this talk, we will discuss a few important ideas from causal inference, such as counterfactual reasoning, the MOA framework, and the use of mutual information as a mechanism for designing social rewards. No prior background in causal modeling is required or expected.
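As a preview of the mutual-information connection: per my reading of the paper, agent k's influence reward at time t is, roughly, the discrepancy between what another agent j does given k's actual action and what it would have done under k's counterfactual actions,

c_t^k = \sum_{j \neq k} D_{\mathrm{KL}}\!\left[\, p(a_t^j \mid a_t^k, s_t) \;\Big\|\; \sum_{\tilde{a}_t^k} p(a_t^j \mid \tilde{a}_t^k, s_t)\, p(\tilde{a}_t^k \mid s_t) \right],

and in expectation this KL term corresponds to the mutual information I(a^k; a^j \mid s), which is what motivates mutual information as the design principle for the social reward.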