TY - JOUR
T1 - A biclustering algorithm for binary matrices based on penalized Bernoulli likelihood
AU - Lee, Seokho
AU - Huang, Jianhua Z.
N1 - Acknowledged KAUST grant number(s): KUS-CI-016-04
Acknowledgements: The authors would like to thank the editor, the associate editor, and two reviewers for helpful comments. Dr. Lan Zhou carefully read the paper and gave many useful suggestions for improving the writing. Lee’s work was supported by Basic Science Research Program through the National Research Foundation (NRF) of Korea (2011-0011608). Huang’s work was partially supported by NCI (CA57030), NSF (DMS-0907170, DMS-1007618, DMS-1208952), and King Abdullah University of Science and Technology (KUS-CI-016-04).
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2013/1/31
Y1 - 2013/1/31
N2 - We propose a new biclustering method for binary data matrices using the maximum penalized Bernoulli likelihood estimation. Our method applies a multi-layer model defined on the logits of the success probabilities, where each layer represents a simple bicluster structure and the combination of multiple layers is able to reveal complicated, multiple biclusters. The method allows for non-pure biclusters, and can simultaneously identify the 1-prevalent blocks and 0-prevalent blocks. A computationally efficient algorithm is developed and guidelines are provided for specifying the tuning parameters, including initial values of model parameters, the number of layers, and the penalty parameters. Missing-data imputation can be handled in the EM framework. The method is tested using synthetic and real datasets and shows good performance. © 2013 Springer Science+Business Media New York.
UR - http://hdl.handle.net/10754/597220
UR - http://link.springer.com/10.1007/s11222-013-9379-3
UR - http://www.scopus.com/inward/record.url?scp=84898547233&partnerID=8YFLogxK
U2 - 10.1007/s11222-013-9379-3
DO - 10.1007/s11222-013-9379-3
M3 - Article
SN - 0960-3174
VL - 24
SP - 429
EP - 441
JO - Statistics and Computing
JF - Statistics and Computing
IS - 3
ER -