Open Access
Diverse correlation structures in gene expression data and their utility in improving statistical inference
December 2007
Lev Klebanov, Andrei Yakovlev
Ann. Appl. Stat. 1(2): 538-559 (December 2007). DOI: 10.1214/07-AOAS120


It is well known that correlations in microarray data are a serious nuisance that degrades the performance of gene selection procedures. This paper is intended to demonstrate that the correlation structure of microarray data provides a rich source of useful information. We discuss distinct correlation substructures revealed in microarray gene expression data by an appropriate ordering of genes. These substructures include stochastic proportionality of expression signals in a large percentage of all gene pairs, negative correlations hidden in ordered gene triples, and a long sequence of weakly dependent random variables associated with ordered pairs of genes. The reported striking regularities are of general biological interest, and they also have far-reaching implications for the theory and practice of statistical methods of microarray data analysis. We illustrate the latter point with a method for testing differential expression of nonoverlapping gene pairs. Although designed for testing a different null hypothesis, this method controls the type I error rate an order of magnitude more accurately than conventional methods of individual gene expression profiling. In addition, this method is robust to technical noise. Quantitative inference of the correlation structure has the potential to extend the analysis of microarray data far beyond currently practiced methods.
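As a rough illustration of why working with gene pairs might be robust to technical noise, the following sketch simulates a toy model in which every gene on an array shares one multiplicative noise factor (additive on the log scale). This model, the parameter values, and all function names are assumptions made for illustration; the sketch is not the authors' testing procedure. Under this assumption, the shared factor induces strong correlation between genes, while the log-expression differences within nonoverlapping pairs cancel it.

```python
import numpy as np


def simulate_log_expression(n_arrays=200, n_genes=6, noise_sd=0.5, seed=0):
    """Toy model (assumed): independent gene signals plus one shared
    log-scale noise term per array."""
    rng = np.random.default_rng(seed)
    # Independent gene-specific log signals, one column per gene.
    signal = rng.normal(loc=8.0, scale=0.2, size=(n_arrays, n_genes))
    # Array-level technical noise, identical for all genes on an array.
    array_noise = rng.normal(loc=0.0, scale=noise_sd, size=(n_arrays, 1))
    return signal + array_noise


def pair_differences(log_expr):
    """Log-ratio within nonoverlapping pairs (0,1), (2,3), ..."""
    n_pairs = log_expr.shape[1] // 2
    return log_expr[:, 0:2 * n_pairs:2] - log_expr[:, 1:2 * n_pairs:2]


def mean_offdiag_corr(x):
    """Average Pearson correlation over all distinct column pairs."""
    c = np.corrcoef(x, rowvar=False)
    mask = ~np.eye(c.shape[0], dtype=bool)
    return float(c[mask].mean())


log_expr = simulate_log_expression()
diffs = pair_differences(log_expr)

raw_corr = mean_offdiag_corr(log_expr)   # inflated by the shared noise
diff_corr = mean_offdiag_corr(diffs)     # shared noise has cancelled

print(f"mean correlation between genes:            {raw_corr:.2f}")
print(f"mean correlation between pair differences: {diff_corr:.2f}")
```

In this toy setting the pair differences are free of the array-level term by construction, so any remaining correlation between them reflects only sampling variability. Real microarray data need not follow such a simple noise model.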



Lev Klebanov, Andrei Yakovlev. "Diverse correlation structures in gene expression data and their utility in improving statistical inference." Ann. Appl. Stat. 1(2): 538-559, December 2007.


Published: December 2007
First available in Project Euclid: 30 November 2007

zbMATH: 1126.62105
MathSciNet: MR2415746
Digital Object Identifier: 10.1214/07-AOAS120

Keywords: correlation structure, gene expression, microarrays

Rights: Copyright © 2007 Institute of Mathematical Statistics


Vol. 1 • No. 2 • December 2007