TY - JOUR
T1 - Cost-sensitive design of quadratic discriminant analysis for imbalanced data
AU - Bejaoui, Amine
AU - Kammoun, Abla
AU - Alouini, Mohamed-Slim
AU - Al-Naffouri, Tareq Y.
PY - 2021/6/12
Y1 - 2021/6/12
N2 - Learning from imbalanced training data represents a major challenge that has attracted recent interest from both academia and industry. As far as classification is concerned, several algorithms have been observed to provide low accuracy when trained on imbalanced data sets, with regularized quadratic discriminant analysis (R-QDA) being the most illustrative example. Based on recent asymptotic findings, the study in [2] brought a better understanding of the reasons behind the excessive sensitivity of R-QDA to data imbalance, which allowed for the development of a novel quadratic-based classifier that is more robust in such scenarios. However, the selection of the parameters of this classifier relied on minimizing the overall classification error rate, which is not a relevant performance metric when the training data are extremely imbalanced. In this work, we follow a multi-model selection approach to choose the parameters of the classifier proposed in [2]. Such an approach involves solving a multi-objective optimization problem, but, contrary to related works, we do not resort to evolutionary algorithms; instead, we rely on a technique that depends solely on the training data and is based on asymptotic approximations of the classification performance. This allows us to transform the multi-objective optimization problem into a scalar optimization problem. Our proposed approach has the main advantages of being more accurate and less complex, as it avoids computationally expensive cross-validation procedures. Its interest extends beyond quadratic discriminant analysis, paving the way towards a principled method for the design of classification algorithms in imbalanced data scenarios.
UR - http://hdl.handle.net/10754/670071
UR - https://linkinghub.elsevier.com/retrieve/pii/S0167865521001896
UR - http://www.scopus.com/inward/record.url?scp=85108591558&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2021.06.002
DO - 10.1016/j.patrec.2021.06.002
M3 - Article
SN - 0167-8655
VL - 149
SP - 24
EP - 29
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -