On Some Fast And Robust Classifiers For High Dimension, Low Sample Size Data

Sarbojit Roy, Jyotishka Ray Choudhury, Subhajit Dutta

Research output: Contribution to conference › Paper › peer-review

2 Scopus citations

Abstract

In high dimension, low sample size (HDLSS) settings, the distance concentration phenomenon affects the performance of several popular classifiers that are based on Euclidean distances. The behaviour of these classifiers in high dimensions is completely governed by the first- and second-order moments of the underlying class distributions. Moreover, the classifiers become useless for such HDLSS data when the first two moments of the competing distributions are equal, or when the moments do not exist. In this work, we propose robust, computationally efficient and tuning-free classifiers applicable in the HDLSS scenario. As the data dimension increases, these classifiers yield perfect classification if the one-dimensional marginals of the underlying distributions are different. We establish strong theoretical properties for the proposed classifiers in ultrahigh-dimensional settings. Numerical experiments with a wide variety of simulated examples and analysis of real data sets exhibit clear and convincing advantages over existing methods.
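The distance concentration mentioned in the abstract can be illustrated with a small simulation. The following is a minimal sketch only, not the authors' classifiers or experiments: the function name `relative_spread`, the standard normal data, the sample size of 20, and the chosen dimensions are all assumptions made for illustration. It measures how the spread of pairwise Euclidean distances, relative to their mean, shrinks as the dimension grows.

```python
import math
import random
import statistics


def relative_spread(dim, n_points=20, seed=0):
    """Return std/mean of pairwise Euclidean distances among
    n_points i.i.d. standard normal vectors in R^dim (illustrative setup)."""
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_points)]
    # All distinct pairs (i < j) of sample points.
    dists = [
        math.dist(points[i], points[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    return statistics.pstdev(dists) / statistics.fmean(dists)


# Hypothetical dimensions, chosen only to show the trend.
spreads = {dim: relative_spread(dim) for dim in (10, 100, 1000, 5000)}
for dim, value in spreads.items():
    print(f"dim={dim:>5}  std/mean of pairwise distances = {value:.4f}")
```

Under these assumptions the ratio should fall steadily with the dimension, so all pairwise distances become nearly equal. That is the regime in which, according to the abstract, Euclidean-distance-based classifiers depend only on the first two moments of the class distributions.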

Original language: English (US)
Pages: 9943-9968
Number of pages: 26
State: Published - 2022
Event: 25th International Conference on Artificial Intelligence and Statistics, AISTATS 2022 - Virtual, Online, Spain
Duration: Mar 28 2022 to Mar 30 2022

Conference

Conference: 25th International Conference on Artificial Intelligence and Statistics, AISTATS 2022
Country/Territory: Spain
City: Virtual, Online
Period: 03/28/22 to 03/30/22

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
