TY - JOUR
T1 - Semi-Supervised KPCA-Based Monitoring Techniques for Detecting COVID-19 Infection through Blood Tests
AU - Harrou, Fouzi
AU - Dairi, Abdelkader
AU - Dorbane, Abdelhakim
AU - Kadri, Farid
AU - Sun, Ying
N1 - KAUST Repository Item: Exported on 2023-05-04
Acknowledgements: The authors would like to acknowledge the support of the King Abdullah University of Science and Technology (KAUST) in conducting this research.
PY - 2023/4/18
Y1 - 2023/4/18
N2 - This study introduces a new method for identifying COVID-19 infections using blood test data as part of an anomaly detection problem by combining the kernel principal component analysis (KPCA) and one-class support vector machine (OCSVM). This approach aims to differentiate healthy individuals from those infected with COVID-19 using blood test samples. The KPCA model is used to identify nonlinear patterns in the data, and the OCSVM is used to detect abnormal features. This approach is semi-supervised as it uses unlabeled data during training and only requires data from healthy cases. The method’s performance was tested using two sets of blood test samples from hospitals in Brazil and Italy. Compared to other semi-supervised models, such as KPCA-based isolation forest (iForest), local outlier factor (LOF), elliptical envelope (EE) schemes, independent component analysis (ICA), and PCA-based OCSVM, the proposed KPCA-OCSVM approach achieved enhanced discrimination performance for detecting potential COVID-19 infections. For the two COVID-19 blood test datasets that were considered, the proposed approach attained an AUC (area under the receiver operating characteristic curve) of 0.99, indicating a high accuracy level in distinguishing between positive and negative samples based on the test results. The study suggests that this approach is a promising solution for detecting COVID-19 infections without labeled data.
AB - This study introduces a new method for identifying COVID-19 infections using blood test data as part of an anomaly detection problem by combining the kernel principal component analysis (KPCA) and one-class support vector machine (OCSVM). This approach aims to differentiate healthy individuals from those infected with COVID-19 using blood test samples. The KPCA model is used to identify nonlinear patterns in the data, and the OCSVM is used to detect abnormal features. This approach is semi-supervised as it uses unlabeled data during training and only requires data from healthy cases. The method’s performance was tested using two sets of blood test samples from hospitals in Brazil and Italy. Compared to other semi-supervised models, such as KPCA-based isolation forest (iForest), local outlier factor (LOF), elliptical envelope (EE) schemes, independent component analysis (ICA), and PCA-based OCSVM, the proposed KPCA-OCSVM approach achieved enhanced discrimination performance for detecting potential COVID-19 infections. For the two COVID-19 blood test datasets that were considered, the proposed approach attained an AUC (area under the receiver operating characteristic curve) of 0.99, indicating a high accuracy level in distinguishing between positive and negative samples based on the test results. The study suggests that this approach is a promising solution for detecting COVID-19 infections without labeled data.
UR - http://hdl.handle.net/10754/691425
UR - https://www.mdpi.com/2075-4418/13/8/1466
U2 - 10.3390/diagnostics13081466
DO - 10.3390/diagnostics13081466
M3 - Article
C2 - 37189568
SN - 2075-4418
VL - 13
SP - 1466
JO - Diagnostics
JF - Diagnostics
IS - 8
ER -