TY - GEN
T1 - Fast Server Learning Rate Tuning for Coded Federated Dropout
AU - Verardo, Giacomo
AU - Barreira, Daniel
AU - Chiesa, Marco
AU - Kostic, Dejan
AU - Maguire, Gerald Q.
N1 - KAUST Repository Item: Exported on 2023-05-24
Acknowledged KAUST grant number(s): ORA-CRG2021-4699
Acknowledgements: This research work was conducted with funding awarded by the Swedish Research Council for the project “Scalable Federated Learning” with registration number 2021-04610. This publication is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Research Administration (ORA) under Award No. ORA-CRG2021-4699.
This publication acknowledges KAUST support, but has no KAUST-affiliated authors.
PY - 2023/3/29
Y1 - 2023/3/29
N2 - In Federated Learning (FL), clients with low computational power train a common machine learning model by exchanging parameter updates instead of transmitting potentially private data. Federated Dropout (FD) is a technique that improves the communication efficiency of an FL session by selecting a subset of model parameters to be updated in each training round. However, compared to standard FL, FD produces considerably lower accuracy and suffers from longer convergence time. In this chapter, we leverage coding theory to enhance FD by allowing different sub-models to be used at each client. We also show that, by carefully tuning the server learning rate hyper-parameter, we can achieve higher training speed while also reaching up to the same final accuracy as the no-dropout case. Evaluations on the EMNIST dataset show that our mechanism achieves 99.6% of the final accuracy of the no-dropout case while requiring 2.43× less bandwidth to achieve this level of accuracy.
AB - In Federated Learning (FL), clients with low computational power train a common machine learning model by exchanging parameter updates instead of transmitting potentially private data. Federated Dropout (FD) is a technique that improves the communication efficiency of an FL session by selecting a subset of model parameters to be updated in each training round. However, compared to standard FL, FD produces considerably lower accuracy and suffers from longer convergence time. In this chapter, we leverage coding theory to enhance FD by allowing different sub-models to be used at each client. We also show that, by carefully tuning the server learning rate hyper-parameter, we can achieve higher training speed while also reaching up to the same final accuracy as the no-dropout case. Evaluations on the EMNIST dataset show that our mechanism achieves 99.6% of the final accuracy of the no-dropout case while requiring 2.43× less bandwidth to achieve this level of accuracy.
UR - http://hdl.handle.net/10754/691974
UR - https://link.springer.com/10.1007/978-3-031-28996-5_7
UR - http://www.scopus.com/inward/record.url?scp=85152560522&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-28996-5_7
DO - 10.1007/978-3-031-28996-5_7
M3 - Conference contribution
SN - 9783031289958
SP - 84
EP - 99
BT - Trustworthy Federated Learning
PB - Springer International Publishing
ER -