A delay-tolerant proximal-gradient algorithm for distributed learning

Konstantin Mishchenko, Franck Iutzeler, Jérôme Malick, Massih Reza Amini

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Distributed learning aims at computing high-quality models by training over scattered data. This covers a diversity of scenarios, including computer clusters or mobile agents. One of the main challenges is then to deal with heterogeneous machines and unreliable communications. In this setting, we propose and analyze a flexible asynchronous optimization algorithm for solving nonsmooth learning problems. Unlike most existing methods, our algorithm is adjustable to various levels of communication costs, machines' computational powers, and data distribution evenness. We prove that the algorithm converges linearly with a fixed learning rate that depends neither on communication delays nor on the number of machines. Although long delays in communication may slow down performance, no delay can break convergence.
Original language: English (US)
Title of host publication: 35th International Conference on Machine Learning, ICML 2018
Publisher: International Machine Learning Society (IMLS)
Pages: 5774-5788
Number of pages: 15
ISBN (Print): 9781510867963
State: Published - Jan 1 2018
