The Annals of Applied Statistics

A statistical framework for testing functional categories in microarray data

William T. Barry, Andrew B. Nobel, and Fred A. Wright

Full-text: Open access


Ready access to emerging databases of gene annotation and functional pathways has shifted assessments of differential expression in DNA microarray studies from single genes to groups of genes with shared biological function. This paper takes a critical look at existing methods for assessing the differential expression of a group of genes (functional category), and provides some suggestions for improved performance. We begin by presenting a general framework, in which the set of genes in a functional category is compared to the complementary set of genes on the array. The framework includes tests for overrepresentation of a category within a list of significant genes, and methods that consider continuous measures of differential expression. Existing tests are divided into two classes. Class 1 tests assume gene-specific measures of differential expression are independent, despite overwhelming evidence of positive correlation. Analytic and simulated results are presented that demonstrate Class 1 tests are strongly anti-conservative in practice. Class 2 tests account for gene correlation, typically through array permutation that by construction has proper Type I error control for the induced null. However, both Class 1 and Class 2 tests use a null hypothesis that all genes have the same degree of differential expression. We introduce a more sensible and general (Class 3) null under which the profile of differential expression is the same within the category and complement. Under this broader null, Class 2 tests are shown to be conservative. We propose standard bootstrap methods for testing against the Class 3 null and demonstrate they provide valid Type I error control and more power than array permutation in simulated datasets and real microarray experiments.
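The contrast the abstract draws between Class 1 tests and array permutation can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes a hypothetical setup in which no gene is differentially expressed, but genes in the category share an array-level factor and so are positively correlated with strength `rho`. All function names and parameter values are illustrative assumptions. The "Class 1 style" test is a plain two-sample z-test on gene-level t statistics that treats genes as independent. The permutation test recomputes the same category statistic after shuffling the array labels.

```python
import numpy as np
from math import erfc, sqrt

def gene_t_stats(X, labels):
    """Welch-type two-sample t statistic per gene (rows = genes, columns = arrays)."""
    a, b = X[:, labels == 1], X[:, labels == 0]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] + b.var(axis=1, ddof=1) / b.shape[1])
    return (a.mean(axis=1) - b.mean(axis=1)) / se

def category_stat(t, in_cat):
    """Mean gene statistic in the category minus the mean in the complement."""
    return t[in_cat].mean() - t[~in_cat].mean()

def class1_pvalue(t, in_cat):
    """Two-sided z-test that treats gene statistics as independent (assumed Class 1 style)."""
    m_cat, m_comp = in_cat.sum(), (~in_cat).sum()
    se = np.sqrt(t[in_cat].var(ddof=1) / m_cat + t[~in_cat].var(ddof=1) / m_comp)
    z = category_stat(t, in_cat) / se
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))

def permutation_pvalue(X, labels, in_cat, n_perm, rng):
    """Two-sided array-permutation p-value: shuffle array labels, keep gene structure."""
    observed = abs(category_stat(gene_t_stats(X, labels), in_cat))
    hits = 0
    for _ in range(n_perm):
        permuted = rng.permutation(labels)
        hits += int(abs(category_stat(gene_t_stats(X, permuted), in_cat)) >= observed)
    return (hits + 1) / (n_perm + 1)

def simulate_null(n_sim=100, n_genes=200, n_cat=20, n_per_group=10,
                  rho=0.5, n_perm=99, alpha=0.05, seed=0):
    """Rejection rates of both tests when no gene is differentially expressed."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([1, 0], n_per_group)
    in_cat = np.arange(n_genes) < n_cat
    reject_class1 = reject_perm = 0
    for _ in range(n_sim):
        X = rng.standard_normal((n_genes, 2 * n_per_group))
        shared = rng.standard_normal(2 * n_per_group)  # one value per array
        # Category genes load on the shared factor, giving pairwise correlation rho.
        X[in_cat] = np.sqrt(rho) * shared + np.sqrt(1 - rho) * X[in_cat]
        t = gene_t_stats(X, labels)
        reject_class1 += int(class1_pvalue(t, in_cat) <= alpha)
        reject_perm += int(permutation_pvalue(X, labels, in_cat, n_perm, rng) <= alpha)
    return reject_class1 / n_sim, reject_perm / n_sim

if __name__ == "__main__":
    class1_rate, perm_rate = simulate_null()
    print(f"Class 1 style rejection rate: {class1_rate:.2f}")
    print(f"Array permutation rejection rate: {perm_rate:.2f}")
```

Under these assumptions the independence-based test should reject far more often than the nominal 5% level, while the permutation test should stay near it, because shuffling array labels leaves the gene-gene correlation intact. The sketch does not cover the Class 3 null or the bootstrap tests proposed in the paper.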

Article information

Ann. Appl. Stat., Volume 2, Number 1 (2008), 286-315.

First available in Project Euclid: 24 March 2008

Keywords: differential expression, array permutation, bootstrap, Type I error, power


Barry, William T.; Nobel, Andrew B.; Wright, Fred A. A statistical framework for testing functional categories in microarray data. Ann. Appl. Stat. 2 (2008), no. 1, 286–315. doi:10.1214/07-AOAS146.



