Slide 50
Z. Li and J. Li. A simple proximal stochastic gradient method for nonsmooth nonconvex optimization. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 5564–5574. Curran Associates, Inc., 2018. URL http://papers.nips.cc/paper/7800-a-simple-proximal-stochastic-gradient-method-for-nonsmooth-nonconvex-optimization.pdf.
Y. Liu, F. Shang, and J. Cheng. Accelerated variance reduced stochastic ADMM. In Thirty-First AAAI Conference on Artificial Intelligence, pages 2287–2293, 2017.
S. Minsker. On some extensions of Bernstein’s inequality for self-adjoint
operators. Statistics & Probability Letters, 127:111–119, 2017.
J. F. Mota, J. M. Xavier, P. M. Aguiar, and M. Püschel. A proof of convergence for the alternating direction method of multipliers applied to polyhedral-constrained functions. arXiv preprint arXiv:1112.2295, 2011.
T. Murata and T. Suzuki. Gradient descent in RKHS with importance labeling, 2020.
L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takáč. SARAH: A novel method for machine learning problems using stochastic recursive gradient. In D. Precup and Y. W. Teh, editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 2613–2621, International Convention Centre, Sydney, Australia, 06–11 Aug 2017. PMLR.