TY - GEN
T1 - Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity
AU - Kovalev, Dmitry
AU - Beznosikov, Aleksandr
AU - Borodich, Ekaterina
AU - Gasnikov, Alexander
AU - Scutari, Gesualdo
N1 - KAUST Repository Item: Exported on 2023-07-10
Acknowledgements: The research in Sections 1–3 was supported by the Russian Science Foundation (project No. 21-71-30005). The work in Section 4 was prepared within the framework of the HSE University Basic Research Program. The research of A. Scutari was partially supported by ONR Grant No. N00014-21-1-2673.
PY - 2022/1/1
Y1 - 2022/1/1
AB - We study structured convex optimization problems with additive objective r := p + q, where r is (µ-strongly) convex, q is Lq-smooth and convex, and p is Lp-smooth, possibly nonconvex. For this class of problems, we propose an inexact accelerated gradient sliding method that can skip the gradient computation for one of these components while still achieving optimal complexity of gradient calls of p and q, that is, O(√(Lp/µ) log(1/ε)) and O(√(Lq/µ) log(1/ε)), respectively. This result is much sharper than the classic black-box complexity O(√((Lp+Lq)/µ) log(1/ε)), especially when the difference between Lp and Lq is large. We then apply the proposed method to solve distributed optimization problems over master-worker architectures under agents' function similarity, due to statistical data similarity or otherwise. The distributed algorithm achieves for the first time lower complexity bounds on both communication and local gradient calls, with the former being a longstanding open problem. Finally, the method is extended to distributed saddle-point problems (under function similarity) by means of solving a class of variational inequalities, achieving lower communication and computation complexity bounds.
UR - http://hdl.handle.net/10754/678602
UR - https://proceedings.neurips.cc/paper_files/paper/2022/hash/d88f6f81e1aaf606776ffdd06fdf24ef-Abstract-Conference.html
UR - http://www.scopus.com/inward/record.url?scp=85163213733&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781713871088
BT - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
PB - Neural Information Processing Systems Foundation
ER -