TY - GEN
T1 - On differentially private stochastic convex optimization with heavy-tailed data
AU - Wang, Di
AU - Xiao, Hanshen
AU - Devadas, Srini
AU - Xu, Jinhui
N1 - Acknowledgements: Di Wang and Jinhui Xu were supported in part by the National Science Foundation (NSF) under Grants No. CCF-1716400 and IIS-1919492.
PY - 2020/1/1
Y1 - 2020/1/1
AB - In this paper, we consider the problem of designing Differentially Private (DP) algorithms for Stochastic Convex Optimization (SCO) on heavy-tailed data. The irregularity of such data violates some key assumptions used in almost all existing DP-SCO and DP-ERM methods, resulting in failure to provide the DP guarantees. To better understand this type of challenge, we provide in this paper a comprehensive study of DP-SCO under various settings. First, we consider the case where the loss function is strongly convex and smooth. For this case, we propose a method based on the sample-and-aggregate framework, which has an excess population risk of O(d^3/(nε^4)) (after omitting other factors), where n is the sample size and d is the dimensionality of the data. Then, we show that with some additional assumptions on the loss functions, it is possible to reduce the expected excess population risk to O(d^2/(nε^2)). To lift these additional conditions, we also provide a gradient smoothing and trimming based scheme to achieve excess population risks of O(d^2/(nε^2)) and O(d^(2/3)/(nε^2)^(1/3)) for strongly convex and general convex loss functions, respectively, with high probability. Experiments suggest that our algorithms can effectively deal with the challenges caused by data irregularity.
UR - http://hdl.handle.net/10754/669227
UR - http://www.scopus.com/inward/record.url?scp=85105337629&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781713821120
SP - 10023
EP - 10033
BT - 37th International Conference on Machine Learning, ICML 2020
PB - International Machine Learning Society (IMLS)
ER -