Towards scalable and data efficient learning of Markov boundaries

Jose M. Peña*, Roland Nilsson, Johan Björkegren, Jesper Tegnér

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

176 Scopus citations

Abstract

We propose algorithms for learning Markov boundaries from data without having to learn a Bayesian network first. We study their correctness, scalability and data efficiency. The last two properties are important because we aim to apply the algorithms to identify the minimal set of features that is needed for probabilistic classification in databases with thousands of features but few instances, e.g. gene expression databases. We evaluate the algorithms on synthetic and real databases, including one with 139,351 features.

Original language: English (US)
Pages (from-to): 211-232
Number of pages: 22
Journal: International Journal of Approximate Reasoning
Volume: 45
Issue number: 2
State: Published - Jul 2007
Externally published: Yes

Keywords

  • Bayesian networks
  • Classification
  • Feature subset selection

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Artificial Intelligence
  • Applied Mathematics
