The Annals of Applied Statistics

Are a set of microarrays independent of each other?

Bradley Efron
Source: Ann. Appl. Stat. Volume 3, Number 3 (2009), 922-942.

Abstract

Having observed an m×n matrix X whose rows are possibly correlated, we wish to test the hypothesis that the columns are independent of each other. Our motivation comes from microarray studies, where the rows of X record expression levels for m different genes, often highly correlated, while the columns represent n individual microarrays, presumably obtained independently. The presumption of independence underlies all the familiar permutation, cross-validation and bootstrap methods for microarray analysis, so it is important to know when independence fails. We develop nonparametric and normal-theory testing methods. The row and column correlations of X interact with each other in a way that complicates test procedures, essentially by reducing the accuracy of the relevant estimators.
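
The interaction between row and column correlations can be illustrated with a small simulation. This is only a sketch under assumed settings, not one of the paper's test procedures: the block-factor model for row correlation, the function names, and the values of m, n, block_size and rho are all hypothetical choices made for illustration. The columns are generated independently in both cases, yet the sample correlations between columns scatter much more widely around zero when the rows are correlated, which is what makes column dependence harder to detect.

```python
import numpy as np


def simulate_independent_columns(m, n, block_size, rho, rng):
    """Simulate an m x n matrix whose columns are independent draws.

    Assumed (illustrative) row model: rows are grouped into blocks of
    `block_size`; within a column, rows of the same block share a latent
    factor, giving them pairwise correlation `rho`.
    """
    n_blocks = m // block_size
    # One latent factor per (block, column); columns never share factors.
    shared = rng.standard_normal((n_blocks, n))
    shared = np.repeat(shared, block_size, axis=0)  # expand to (m, n)
    noise = rng.standard_normal((m, n))
    return np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * noise


def rms_column_correlation(X):
    """Root mean square of the off-diagonal sample column correlations."""
    C = np.corrcoef(X, rowvar=False)  # n x n correlations between columns
    off_diag = C[np.triu_indices_from(C, k=1)]
    return float(np.sqrt(np.mean(off_diag ** 2)))


rng = np.random.default_rng(0)
m, n = 500, 20

# Independent columns in both cases; only the row correlation differs.
x_indep_rows = simulate_independent_columns(m, n, block_size=50, rho=0.0, rng=rng)
x_corr_rows = simulate_independent_columns(m, n, block_size=50, rho=0.8, rng=rng)

rms_indep = rms_column_correlation(x_indep_rows)
rms_corr = rms_column_correlation(x_corr_rows)

print(f"RMS column correlation, independent rows: {rms_indep:.3f}")
print(f"RMS column correlation, correlated rows:  {rms_corr:.3f}")
```

With independent rows the off-diagonal column correlations have standard deviation of roughly 1/sqrt(m); correlated rows reduce the effective number of rows, so the same estimator becomes noticeably less accurate even though the columns remain independent.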

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1254773272
Digital Object Identifier: doi:10.1214/09-AOAS236
Zentralblatt MATH identifier: 05758445
Mathematical Reviews number (MathSciNet): MR2750220

2012 © Institute of Mathematical Statistics
