TY - JOUR
T1 - Randomized Distributed Mean Estimation: Accuracy vs. Communication
AU - Konečný, Jakub
AU - Richtárik, Peter
N1 - KAUST Repository Item: Exported on 2021-03-02
Acknowledgements: JK acknowledges support from Google via a Google European Doctoral Fellowship. Work done while at University of Edinburgh, currently at Google. PR acknowledges support from Amazon, and the EPSRC Grant EP/K02325X/1, Accelerated Coordinate Descent Methods for Big Data Optimization and EPSRC Fellowship EP/N005538/1, Randomized Algorithms for Extreme Convex Optimization.
PY - 2018/12/18
Y1 - 2018/12/18
AB - We consider the problem of estimating the arithmetic average of a finite collection of real vectors stored in a distributed fashion across several compute nodes subject to a communication budget constraint. Our analysis does not rely on any statistical assumptions about the source of the vectors. This problem arises as a subproblem in many applications, including reduce-all operations within algorithms for distributed and federated optimization and learning. We propose a flexible family of randomized algorithms exploring the trade-off between expected communication cost and estimation error. Our family contains the full-communication and zero-error method on one extreme, and an ϵ-bit communication and O(1/(ϵn)) error method on the opposite extreme. In the special case where we communicate, in expectation, a single bit per coordinate of each vector, we improve upon existing results by obtaining O(r/n) error, where r is the number of bits used to represent a floating point value.
UR - http://hdl.handle.net/10754/667779
UR - https://www.frontiersin.org/article/10.3389/fams.2018.00062/full
UR - http://www.scopus.com/inward/record.url?scp=85079925934&partnerID=8YFLogxK
U2 - 10.3389/fams.2018.00062
DO - 10.3389/fams.2018.00062
M3 - Article
SN - 2297-4687
VL - 4
JO - Frontiers in Applied Mathematics and Statistics
JF - Frontiers in Applied Mathematics and Statistics
ER -