TY - CONF
T1 - On PAC Learning Halfspaces in Non-interactive Local Privacy Model with Public Unlabeled Data
AU - Su, Jinyan
AU - Xu, Jinhui
AU - Wang, Di
N1 - KAUST Repository Item: Exported on 2023-07-04
Acknowledged KAUST grant number(s): BAS/1/1689-01-01, FCC/1/1976-49-01, REI/1/4811-10-01, URF/1/4663-01-01
Acknowledgements: Di Wang was supported in part by the baseline funding BAS/1/1689-01-01, funding from the CRG grant URF/1/4663-01-01, FCC/1/1976-49-01 from CBRC, and funding from the AI Initiative REI/1/4811-10-01 of King Abdullah University of Science and Technology (KAUST). He was also supported by the funding of the SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence (SDAIA-KAUST AI). Part of the work was done when Jinyan Su was a research intern at KAUST.
PY - 2022/1/1
Y1 - 2022/1/1
N2 - In this paper, we study the problem of PAC learning halfspaces in the non-interactive local differential privacy model (NLDP). To break the barrier of exponential sample complexity, previous results studied a relaxed setting where the server has access to some additional public but unlabeled data. We continue in this direction. Specifically, we consider the problem under the standard setting instead of the large margin setting studied before. Under different mild assumptions on the underlying data distribution, we propose two approaches based on the Massart noise model and self-supervised learning, and show that it is possible to achieve sample complexities that are only linear in the dimension and polynomial in other terms for both private and public data, which significantly improves over previous results. Our methods could also be used for other private PAC learning problems.
UR - http://hdl.handle.net/10754/681690
UR - https://proceedings.mlr.press/v189/su23a.html
UR - http://www.scopus.com/inward/record.url?scp=85162220162&partnerID=8YFLogxK
M3 - Conference contribution
SP - 927
EP - 941
BT - 14th Asian Conference on Machine Learning, ACML 2022
PB - ML Research Press
ER -