TY - JOUR
T1 - Robust clustering for functional data based on trimming and constraints
AU - Rivera-García, Diego
AU - García-Escudero, Luis A.
AU - Mayo-Iscar, Agustín
AU - Ortega, Joaquín
N1 - Generated from Scopus record by KAUST IRTS on 2019-11-20
PY - 2018/2/3
Y1 - 2018/2/3
N2 - Many clustering algorithms for data consisting of curves or functions have recently been proposed. However, the presence of contamination in the sample of curves can influence the performance of most of them. In this work we propose a robust, model-based clustering method that relies on an approximation to the “density function” for functional data. The robustness follows from the joint application of data-driven trimming, which reduces the effect of contaminated observations, and constraints on the variances, which avoid spurious clusters in the solution. The algorithm is designed to perform clustering and outlier detection simultaneously by maximizing a trimmed “pseudo” likelihood. The proposed method has been evaluated and compared with other existing methods through a simulation study. The proposed methodology shows better performance when a fraction of contaminating curves is added to a non-contaminated sample. Finally, an application to a real data set previously considered in the literature is given.
AB - Many clustering algorithms for data consisting of curves or functions have recently been proposed. However, the presence of contamination in the sample of curves can influence the performance of most of them. In this work we propose a robust, model-based clustering method that relies on an approximation to the “density function” for functional data. The robustness follows from the joint application of data-driven trimming, which reduces the effect of contaminated observations, and constraints on the variances, which avoid spurious clusters in the solution. The algorithm is designed to perform clustering and outlier detection simultaneously by maximizing a trimmed “pseudo” likelihood. The proposed method has been evaluated and compared with other existing methods through a simulation study. The proposed methodology shows better performance when a fraction of contaminating curves is added to a non-contaminated sample. Finally, an application to a real data set previously considered in the literature is given.
UR - http://link.springer.com/10.1007/s11634-018-0312-7
UR - http://www.scopus.com/inward/record.url?scp=85045144182&partnerID=8YFLogxK
U2 - 10.1007/s11634-018-0312-7
DO - 10.1007/s11634-018-0312-7
M3 - Article
SN - 1862-5355
JO - Advances in Data Analysis and Classification
JF - Advances in Data Analysis and Classification
ER -