Abstract
Hierarchical models are a powerful tool for high-throughput data with a small to moderate number of replicates, as they allow sharing information across units of information, for example, genes. We propose two such models and show their increased sensitivity in microarray differential expression applications. We build on the gamma–gamma hierarchical model introduced by Kendziorski et al. [Statist. Med. 22 (2003) 3899–3914] and Newton et al. [Biostatistics 5 (2004) 155–176], by addressing important limitations that may have hampered its performance and its more widespread use. The models parsimoniously describe the expression of thousands of genes with a small number of hyper-parameters. This makes them easy to interpret and analytically tractable. The first model is a simple extension that improves the fit substantially with almost no increase in complexity. The second extension uses a mixture of gamma distributions to further improve the fit, at the expense of increased computational burden. We derive several approximations that significantly reduce the computational cost. We find that our models outperform the original formulation of the model, as well as some other popular methods for differential expression analysis. The improved performance is especially noticeable for the small sample sizes commonly encountered in high-throughput experiments. Our methods are implemented in the freely available Bioconductor gaga package.
Citation
David Rossell. "GaGa: A parsimonious and flexible model for differential expression analysis." Ann. Appl. Stat. 3 (3) 1035–1051, September 2009. https://doi.org/10.1214/09-AOAS244
Information