Slide 22
References
● John Duchi, Elad Hazan, Yoram Singer. Adaptive Subgradient Methods for Online Learning
and Stochastic Optimization. https://dl.acm.org/citation.cfm?id=2021068
The AdaGrad paper. It leans heavily toward optimization theory, so it is a difficult read.
● Sebastian Ruder. An overview of gradient descent optimization algorithms.
https://arxiv.org/abs/1609.04747
A survey of gradient descent and stochastic gradient descent methods. Easy to follow.
● Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun. Adaptive Gradient Methods with Dynamic Bound of
Learning Rate. https://openreview.net/forum?id=Bkg3g2R9FX
The AdaBound paper.