2Direction: Theoretically Faster Distributed Training with Bidirectional Communication Compression

Alexander Tyurin, Peter Richtárik

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

We consider distributed convex optimization problems in the regime where communication between the server and the workers is expensive in both the uplink and downlink directions. We develop a new and provably accelerated method, which we call 2Direction, based on fast bidirectional compressed communication and a new bespoke error-feedback mechanism that may be of independent interest. Indeed, we find that the EF and EF21-P mechanisms (Seide et al., 2014; Gruntkowska et al., 2023), which have had considerable success in the design of efficient non-accelerated methods, are not appropriate for accelerated methods. In particular, we prove that 2Direction improves the previous state-of-the-art communication complexity (Equation presented) (Gruntkowska et al., 2023) to (Equation presented) in the µ-strongly-convex setting, where L and L_max are smoothness constants, n is the number of workers, ω and α are the compression errors of the RandK and TopK sparsifiers (as examples), and K is the number of coordinates/bits that the server and workers send to each other. Moreover, our method is the first that improves upon the communication complexity of the vanilla accelerated gradient descent (AGD) method (Nesterov, 2018). We obtain similar improvements in the general convex regime as well. Finally, our theoretical findings are corroborated by experimental evidence.
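To make the roles of ω and α concrete, the following is a minimal sketch of the two sparsifiers the abstract names as examples. It is not the 2Direction method and is not taken from the paper; it assumes the standard textbook definitions, under which RandK (rescaled by d/K) is unbiased with variance parameter ω = d/K − 1, and TopK is a contractive compressor with α = K/d, where d is the vector dimension. The function names `rand_k`, `top_k`, and `sq_norm`, and the sample vector, are illustrative choices.

```python
import random


def rand_k(x, k, rng):
    """RandK (assumed standard form): keep k uniformly random coordinates,
    rescaled by d/k so that the compressor is unbiased."""
    d = len(x)
    kept = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in kept else 0.0 for i, v in enumerate(x)]


def top_k(x, k):
    """TopK (assumed standard form): keep the k largest-magnitude
    coordinates unchanged and zero out the rest (biased, contractive)."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    kept = set(order[:k])
    return [v if i in kept else 0.0 for i, v in enumerate(x)]


def sq_norm(x):
    """Squared Euclidean norm."""
    return sum(v * v for v in x)


# Illustrative vector; d and k are arbitrary small values.
x = [3.0, -1.0, 0.5, 2.0, -4.0, 0.0, 1.5, -0.5]
d, k = len(x), 2

omega = d / k - 1  # RandK: E||C(x) - x||^2 = omega * ||x||^2
alpha = k / d      # TopK:  ||C(x) - x||^2 <= (1 - alpha) * ||x||^2

top = top_k(x, k)
top_err = sq_norm([a - b for a, b in zip(top, x)])
rnd = rand_k(x, k, random.Random(0))
```

Both compressors transmit only K of the d coordinates; the difference is that RandK pays for unbiasedness with variance that grows as K shrinks, while TopK keeps the error bounded by a contraction factor at the cost of bias.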

Original language: English (US)
State: Published - 2023
Event: 37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States
Duration: Dec 10 2023 → Dec 16 2023

Conference

Conference: 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Country/Territory: United States
City: New Orleans
Period: 12/10/23 → 12/16/23

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
