The Annals of Applied Statistics

Are a set of microarrays independent of each other?

Bradley Efron

Full-text: Open access

Abstract

Having observed an m×n matrix X whose rows are possibly correlated, we wish to test the hypothesis that the columns are independent of each other. Our motivation comes from microarray studies, where the rows of X record expression levels for m different genes, often highly correlated, while the columns represent n individual microarrays, presumably obtained independently. The presumption of independence underlies all the familiar permutation, cross-validation and bootstrap methods for microarray analysis, so it is important to know when independence fails. We develop nonparametric and normal-theory testing methods. The row and column correlations of X interact with each other in a way that complicates test procedures, essentially by reducing the accuracy of the relevant estimators.
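The interaction described in the last sentence can be illustrated with a small simulation. The sketch below is not one of the paper's test procedures; it is a toy example under assumed settings (the matrix sizes, the block-factor model for row correlation, and the names `column_corr_spread`, `k` and `rho` are all hypothetical). It generates two matrices whose columns are truly independent and compares how widely the sample correlations between columns scatter around zero.

```python
import numpy as np


def column_corr_spread(X):
    """Standard deviation of the off-diagonal sample correlations between columns."""
    C = np.corrcoef(X, rowvar=False)             # n x n correlation matrix of the columns
    off_diag = C[np.triu_indices_from(C, k=1)]   # each column pair counted once
    return float(off_diag.std())


rng = np.random.default_rng(0)
m, n = 2000, 20        # m genes (rows), n microarrays (columns); illustrative sizes
k, rho = 10, 0.8       # hypothetical: k gene blocks, within-block row correlation rho

# Case 1: rows and columns both independent.
X_indep = rng.standard_normal((m, n))

# Case 2: columns still independent, but genes in the same block share a
# per-array factor, so rows within a block have correlation rho.
block_of_gene = np.repeat(np.arange(k), m // k)  # block label for each row
F = rng.standard_normal((k, n))                  # one factor per block and array
E = rng.standard_normal((m, n))                  # gene-specific noise
X_rowcorr = np.sqrt(rho) * F[block_of_gene] + np.sqrt(1 - rho) * E

spread_indep = column_corr_spread(X_indep)
spread_rowcorr = column_corr_spread(X_rowcorr)
print(f"independent rows: {spread_indep:.3f}")
print(f"correlated rows:  {spread_rowcorr:.3f}")
```

In both cases the true column correlations are zero, yet the sample correlations in the second case scatter much more widely, as if far fewer than m genes had been observed. This is one way to picture why correlated rows make evidence about column dependence harder to assess.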

Article information

Source
Ann. Appl. Stat., Volume 3, Number 3 (2009), 922-942.

Dates
First available in Project Euclid: 5 October 2009

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1254773272

Digital Object Identifier
doi:10.1214/09-AOAS236

Mathematical Reviews number (MathSciNet)
MR2750220

Zentralblatt MATH identifier
1196.62138

Keywords
Total correlation; effective sample size; permutation tests; matrix normal distribution; row and column correlations

Citation

Efron, Bradley. Are a set of microarrays independent of each other? Ann. Appl. Stat. 3 (2009), no. 3, 922–942. doi:10.1214/09-AOAS236. https://projecteuclid.org/euclid.aoas/1254773272


