Mordvintsev², Andrey Zhmoginov², Max Vladymyrov²
¹ETH Zurich, ²Google Research
Komei Sugiura Laboratory (杉浦孔明研究室), Seitaro Otsuki (小槻誠太郎)
ICML 2023, Oral / Poster
J. von Oswald, E. Niklasson, E. Randazzo, J. Sacramento, A. Mordvintsev, A. Zhmoginov, and M. Vladymyrov, “Transformers Learn In-Context by Gradient Descent,” in ICML, 2023, pp. 35151–35174.