Non-negative matrix factorization for discrete data with hierarchical side-information

Changwei Hu, Piyush Rai, Lawrence Carin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Scopus citations

Abstract

We present a probabilistic framework for efficient non-negative matrix factorization of discrete (count or binary) data with side-information. The side-information is given as a multi-level structure, taxonomy, or ontology, in which the nodes at each level are categorical-valued observations. For example, when modeling documents with two-level side-information (the documents themselves being level zero), level one may represent the one or more authors associated with each document, and level two may represent the affiliations of each author. The model generalizes readily to more than two levels, that is, to a taxonomy or ontology of arbitrary depth. It learns embeddings of the entities present at each level of the data/side-information hierarchy (documents, authors, and affiliations in the example above), with appropriate sharing of information across levels. The model also enjoys full local conjugacy, which facilitates efficient Gibbs sampling for inference. Inference cost scales with the number of non-zero entries in the data matrix, which is especially appealing for massive but sparse real-world matrices. We demonstrate the effectiveness of the model on several real-world data sets.
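
To illustrate why inference cost can depend only on the non-zero entries, here is a minimal sketch of one sub-step that is common in Gibbs samplers for Poisson-style count factorization: each observed count is split across K latent factors by a multinomial draw. This is an assumption-laden illustration, not the authors' model. The abstract does not give the likelihood, the priors, or how the side-information hierarchy enters, so the function name `allocate_latent_counts`, the factor matrices `U` and `V`, and the gamma-distributed toy values are all hypothetical.

```python
import random


def allocate_latent_counts(entries, U, V, rng):
    """Split each non-zero count x_ij across K latent factors.

    entries: list of (i, j, x) triples holding ONLY the non-zero cells.
    U, V:    hypothetical non-negative factor matrices (lists of K-vectors).
    Returns per-row and per-column factor counts, the sufficient statistics
    a conjugate update of U and V would typically consume.
    """
    K = len(U[0])
    row_stats = [[0] * K for _ in U]
    col_stats = [[0] * K for _ in V]
    for i, j, x in entries:  # loop length = number of non-zeros
        # Factor k receives a share of x_ij proportional to U[i][k] * V[j][k].
        weights = [U[i][k] * V[j][k] for k in range(K)]
        # x independent categorical draws are equivalent to one multinomial draw.
        for k in rng.choices(range(K), weights=weights, k=x):
            row_stats[i][k] += 1
            col_stats[j][k] += 1
    return row_stats, col_stats


rng = random.Random(0)
K = 3
# Toy factors for a 4 x 5 count matrix (illustrative values only).
U = [[rng.gammavariate(1.0, 1.0) for _ in range(K)] for _ in range(4)]
V = [[rng.gammavariate(1.0, 1.0) for _ in range(K)] for _ in range(5)]
# Sparse representation: 3 non-zero cells out of 20.
entries = [(0, 2, 3), (1, 0, 1), (3, 4, 7)]
row_stats, col_stats = allocate_latent_counts(entries, U, V, rng)
```

Zero-valued cells never appear in `entries`, so they contribute no work to this step. In this sketch, that is the sense in which the cost tracks the number of non-zeros rather than the full matrix size.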
Original language: English (US)
Title of host publication: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016
Publisher: PMLR
Pages: 1124-1132
Number of pages: 9
State: Published - Jan 1 2016
Externally published: Yes
