Abstract
We propose a novel systematic procedure of non-linear data transformation for an adaptive algorithm used in network reverse-engineering with information-theoretic methods. Our methodology is rooted in elucidating and correcting for the specific biases in techniques that estimate mutual information (MI) from a finite sample of data. These biases are, in turn, tied to the lack of well-defined bounds for the numerical estimation of MI for continuous probability distributions from finite data. The nature and properties of the inevitable bias are described, complemented by several examples illustrating its form and variation. We propose an adaptive partitioning scheme for MI estimation that effectively transforms the sample data using parameters determined from its local and global distribution, guaranteeing a more robust and reliable reconstruction algorithm. Together with a normalized measure (Shared Information Metric), we report considerably enhanced performance for both in silico and real-world biological networks. We also find that the recovery of true interactions is particularly improved for an intermediate range of false positive rates, suggesting that our algorithm is less vulnerable to spurious signals of association.
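The finite-sample bias mentioned in the abstract can be illustrated with a minimal sketch. This is not the adaptive partitioning scheme or the Shared Information Metric of the paper; it is a generic plug-in (histogram) MI estimator applied to rank-transformed data, which gives roughly equal-frequency bins. The function names `plugin_mi`, `rank_transform`, and `mean_mi_independent`, the bin count, and the sample sizes are all illustrative assumptions. For independent variables the true MI is zero, so any positive average estimate is bias.

```python
import numpy as np

def plugin_mi(x, y, bins):
    """Plug-in (histogram) estimate of mutual information, in nats."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()              # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)      # empirical marginal of x
    py = pxy.sum(axis=0, keepdims=True)      # empirical marginal of y
    mask = pxy > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def rank_transform(x):
    """Replace each sample by its rank scaled to (0, 1] (ties not handled)."""
    ranks = np.argsort(np.argsort(x))
    return (ranks + 1) / len(x)

def mean_mi_independent(n, bins, trials=100, seed=0):
    """Average plug-in MI over independent Gaussian pairs (true MI is 0)."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(trials):
        x = rng.normal(size=n)
        y = rng.normal(size=n)
        estimates.append(plugin_mi(rank_transform(x), rank_transform(y), bins))
    return float(np.mean(estimates))

# Illustrative settings (assumptions, not values from the paper).
bias_small_sample = mean_mi_independent(n=50, bins=5)
bias_large_sample = mean_mi_independent(n=5000, bins=5)
print(bias_small_sample, bias_large_sample)
```

In this sketch the average estimate for independent data stays above zero and shrinks as the sample size grows, which is the kind of sample-size-dependent bias a partitioning scheme has to account for.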
Original language | English (US) |
---|---|
Pages (from-to) | 507-520 |
Number of pages | 14 |
Journal | Statistical Applications in Genetics and Molecular Biology |
Volume | 15 |
Issue number | 6 |
DOIs | |
State | Published - Dec 1 2016 |
Keywords
- Algorithms
- mutual information
- numerical estimation
ASJC Scopus subject areas
- Computational Mathematics
- Genetics
- Molecular Biology
- Statistics and Probability