TY - JOUR
T1 - Risk Convergence of Centered Kernel Ridge Regression with Large Dimensional Data
AU - Elkhalil, Khalil
AU - Kammoun, Abla
AU - Zhang, Xiangliang
AU - Alouini, Mohamed-Slim
AU - Al-Naffouri, Tareq Y.
N1 - Acknowledged KAUST grant number(s): OSR-CRG2019-4041
Acknowledgements: This work was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award OSR-CRG2019-4041.
PY - 2020
N2 - This paper carries out a large dimensional analysis of a variant of kernel ridge regression that we call centered kernel ridge regression (CKRR), also known in the literature as kernel ridge regression with offset. This modified technique is obtained by accounting for the bias in the regression problem, which yields ordinary kernel ridge regression but with centered kernels. The analysis is carried out under the assumption that the data are drawn from a Gaussian distribution and relies heavily on tools from random matrix theory (RMT). In the regime where the data dimension and the training size grow infinitely large at a fixed ratio, and under mild assumptions controlling the data statistics, we show that both the empirical and the prediction risks converge to deterministic quantities that describe, in closed form, the performance of CKRR in terms of the data statistics and dimensions. Inspired by this theoretical result, we then build a consistent estimator of the prediction risk based on the training data, which allows the design parameters to be tuned optimally. A key insight of the proposed analysis is that, asymptotically, a large class of kernels achieves the same minimum prediction risk. This insight is validated on both synthetic and real data.
UR - http://hdl.handle.net/10754/660647
UR - https://ieeexplore.ieee.org/document/9018066/
UR - http://www.scopus.com/inward/record.url?scp=85081412702&partnerID=8YFLogxK
DO - 10.1109/TSP.2020.2975939
M3 - Article
SN - 1053-587X
VL - 68
SP - 1574
EP - 1588
JF - IEEE Transactions on Signal Processing
ER -