Abstract
We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMAL-OPTIMAL) vs. finding all features relevant to the target variable (ALL-RELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene expression analysis. For both problems, we identify classes of data distributions for which there exist consistent, polynomial-time algorithms. We also prove that ALL-RELEVANT is much harder than MINIMAL-OPTIMAL and propose two consistent, polynomial-time algorithms. We argue that the distribution classes considered are reasonable in many practical cases, so that our results simplify feature selection in a wide range of machine learning tasks.
Original language | English (US) |
---|---|
Pages (from-to) | 589-612 |
Number of pages | 24 |
Journal | Journal of Machine Learning Research |
Volume | 8 |
State | Published - Mar 2007 |
Externally published | Yes |
Keywords
- Bioinformatics
- Classification
- Learning theory
- Markov blanket
- Relevance
ASJC Scopus subject areas
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence