Abstract
MOTIVATION: Quantitative mass spectrometry (MS)-based proteomics involves statistical inference on protein abundance, based on the intensities of each protein's associated spectral peaks. However, typical MS-based proteomics datasets have substantial proportions of missing observations, due at least in part to censoring of low intensities. This complicates intensity-based differential expression analysis.

RESULTS: We outline a statistical method for protein differential expression, based on a simple Binomial likelihood. By modeling peak intensities as binary, in terms of 'presence/absence', we enable the selection of proteins not typically amenable to quantitative analysis, for example 'one-state' proteins that are present in one condition but absent in another. In addition, we present an analysis protocol that combines quantitative and presence/absence analysis of a given dataset in a principled way, resulting in a single list of selected proteins with a single associated false discovery rate.

AVAILABILITY: All R code is available at http://www.stat.tamu.edu/~adabney/share/xuan_code.zip.
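
To make the presence/absence idea concrete, the following is a minimal sketch of how a Binomial likelihood can compare peak presence between two conditions. It is not the authors' implementation (their R code is at the URL above) and may differ from their model in important ways. The function names `binom_loglik` and `presence_absence_lrt`, the likelihood ratio test, and the chi-square approximation with one degree of freedom are all assumptions made for illustration.

```python
import math


def binom_loglik(k, n, p):
    """Binomial log-likelihood of k present peaks out of n (constant term dropped)."""
    # Treat 0 * log(0) as 0 so that p = 0 or p = 1 is handled cleanly.
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1 - p)
    return ll


def presence_absence_lrt(k1, n1, k2, n2):
    """Illustrative likelihood ratio test (an assumption, not the paper's procedure).

    k1 of n1 peaks are present in condition 1, k2 of n2 in condition 2.
    Null hypothesis: both conditions share one presence probability.
    Assumes n1 > 0 and n2 > 0.
    """
    p1, p2 = k1 / n1, k2 / n2          # per-condition maximum likelihood estimates
    p0 = (k1 + k2) / (n1 + n2)         # pooled estimate under the null
    ll_alt = binom_loglik(k1, n1, p1) + binom_loglik(k2, n2, p2)
    ll_null = binom_loglik(k1, n1, p0) + binom_loglik(k2, n2, p0)
    stat = max(0.0, 2 * (ll_alt - ll_null))
    # Survival function of a chi-square with 1 degree of freedom: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical 'one-state' protein: present in all 10 samples of one
# condition and in none of the 10 samples of the other.
stat, p_value = presence_absence_lrt(10, 10, 0, 10)
print(f"LRT statistic = {stat:.3f}, p-value = {p_value:.2e}")
```

The chi-square approximation can be poor for small counts, so an exact or permutation-based test may be preferable in practice. Combining such p-values with those from an intensity-based analysis into a single false discovery rate, as the abstract describes, is not attempted here.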
Original language | English (US) |
---|---|
Pages (from-to) | 1586-1591 |
Number of pages | 6 |
Journal | Bioinformatics |
Volume | 28 |
Issue number | 12 |
State | Published - Apr 19 2012 |
Externally published | Yes |