The Annals of Applied Statistics

Are a set of microarrays independent of each other?

Bradley Efron

Source: Ann. Appl. Stat. Volume 3, Number 3 (2009), 922-942.

Abstract

Having observed an m×n matrix X whose rows are possibly correlated, we wish to test the hypothesis that the columns are independent of each other. Our motivation comes from microarray studies, where the rows of X record expression levels for m different genes, often highly correlated, while the columns represent n individual microarrays, presumably obtained independently. The presumption of independence underlies all the familiar permutation, cross-validation and bootstrap methods for microarray analysis, so it is important to know when independence fails. We develop nonparametric and normal-theory testing methods. The row and column correlations of X interact with each other in a way that complicates test procedures, essentially by reducing the accuracy of the relevant estimators.
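
The interaction between row and column correlations can be seen in a small simulation. The sketch below is not the paper's test procedure. It is an illustration under assumed settings: a hypothetical low-rank factor model for the rows, and the helper names `simulate_matrix` and `offdiag_column_corr_rms`, none of which come from the source. The columns are generated independently in both cases, yet the sample correlations between columns spread much more widely when the rows are correlated, because m correlated rows carry less information than m independent ones.

```python
import numpy as np

def simulate_matrix(m, n, k, strength, rng):
    """Return an m x n matrix whose columns are generated independently.

    Rows share k latent factors (an assumed model, for illustration only).
    strength = 0 gives independent rows; larger values give correlated rows.
    """
    loadings = rng.standard_normal((m, k))   # one loading vector per row (gene)
    factors = rng.standard_normal((k, n))    # independent draw per column (array)
    noise = rng.standard_normal((m, n))
    return strength * loadings @ factors + noise

def offdiag_column_corr_rms(X):
    """Root mean square of the off-diagonal sample correlations between columns."""
    C = np.corrcoef(X, rowvar=False)              # n x n column correlation matrix
    off = C[~np.eye(C.shape[0], dtype=bool)]      # drop the diagonal of ones
    return float(np.sqrt(np.mean(off ** 2)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, k = 2000, 10, 2
    rms_indep_rows = offdiag_column_corr_rms(simulate_matrix(m, n, k, 0.0, rng))
    rms_corr_rows = offdiag_column_corr_rms(simulate_matrix(m, n, k, 1.0, rng))
    print("independent rows:", rms_indep_rows)
    print("correlated rows: ", rms_corr_rows)
```

With independent rows the root mean square column correlation is of order 1/sqrt(m). With correlated rows it stays far from zero even though the columns are independent by construction, which is why a naive test that treats the m rows as independent observations would overstate the evidence against column independence.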

Keywords: Total correlation; effective sample size; permutation tests; matrix normal distribution; row and column correlations

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1254773272
Digital Object Identifier: doi:10.1214/09-AOAS236

2009 © Institute of Mathematical Statistics