TY - JOUR
T1 - A Large Dimensional Study of Regularized Discriminant Analysis
AU - Elkhalil, Khalil
AU - Kammoun, Abla
AU - Couillet, Romain
AU - Al-Naffouri, Tareq Y.
AU - Alouini, Mohamed-Slim
PY - 2020
Y1 - 2020
N2 - In this paper, we conduct a large dimensional study of regularized discriminant analysis classifiers and their two popular variants, regularized LDA and regularized QDA. The analysis assumes that the data samples are drawn from a Gaussian mixture model with class-dependent means and covariances, and relies on tools from random matrix theory (RMT). We consider the regime in which both the data dimension and the training size within each class tend to infinity at a fixed ratio. Under mild assumptions, we show that the probability of misclassification converges to a deterministic quantity that describes, in closed form, the performance of these classifiers in terms of the class statistics and the problem dimension. This result allows for a better understanding of the underlying classification algorithms in practical, large but finite, dimensions. Further exploitation of the result permits optimal tuning of the regularization parameter so as to minimize the probability of misclassification. The analysis is validated with numerical results involving synthetic as well as real data from the USPS dataset, yielding high accuracy in predicting the performance and thereby establishing an interesting connection between theory and practice.
UR - http://hdl.handle.net/10754/662431
UR - https://ieeexplore.ieee.org/document/9055087/
UR - http://www.scopus.com/inward/record.url?scp=85085173110&partnerID=8YFLogxK
U2 - 10.1109/TSP.2020.2984160
DO - 10.1109/TSP.2020.2984160
M3 - Article
SN - 1941-0476
VL - 68
SP - 1
EP - 1
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
ER -