The Annals of Applied Statistics

On testing the significance of sets of genes

Bradley Efron and Robert Tibshirani

Abstract

This paper discusses the problem of identifying differentially expressed groups of genes from a microarray experiment. The groups of genes are externally defined, for example, sets of gene pathways derived from biological databases. Our starting point is the interesting Gene Set Enrichment Analysis (GSEA) procedure of Subramanian et al. [Proc. Natl. Acad. Sci. USA 102 (2005) 15545–15550]. We study the problem in some generality and propose two potential improvements to GSEA: the maxmean statistic for summarizing gene-sets, and restandardization for more accurate inferences. We discuss a variety of examples and extensions, including the use of gene-set scores for class predictions. We also describe a new R language package GSA that implements our ideas.
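
The abstract names the maxmean statistic without giving its formula. A minimal sketch follows, assuming maxmean takes the larger of two averages over all genes in a set: the average positive part and the average negative part of the per-gene scores. The function name `maxmean` and the signed return value are illustrative choices, not the interface of the GSA package.

```python
def maxmean(scores):
    """Maxmean summary of per-gene scores for a single gene set.

    Assumption: both averages divide by the total number of genes in the
    set, not by the number of positive (or negative) scores.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("gene set is empty")
    # Average of the positive parts, max(z, 0), over all genes in the set.
    pos_mean = sum(max(z, 0.0) for z in scores) / n
    # Average of the negative parts, max(-z, 0), over all genes in the set.
    neg_mean = sum(max(-z, 0.0) for z in scores) / n
    # Keep whichever side dominates; the sign records the direction
    # (an illustrative convention, assumed here).
    return pos_mean if pos_mean >= neg_mean else -neg_mean


if __name__ == "__main__":
    # Hypothetical per-gene scores for two small gene sets.
    print(maxmean([2.0, -1.0, 0.0, 3.0]))
    print(maxmean([-4.0, 1.0]))
```

Because each average divides by the full set size, a few large scores in one direction can dominate the summary even when most genes in the set score near zero.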

Article information

Source
Ann. Appl. Stat. Volume 1, Number 1 (2007), 107-129.

Dates
First available: 29 June 2007

Permanent link to this document
http://projecteuclid.org/euclid.aoas/1183143731

Digital Object Identifier
doi:10.1214/07-AOAS101

Mathematical Reviews number (MathSciNet)
MR2393843

Zentralblatt MATH identifier
1129.62102

Citation

Efron, Bradley; Tibshirani, Robert. On testing the significance of sets of genes. The Annals of Applied Statistics 1 (2007), no. 1, 107–129. doi:10.1214/07-AOAS101. http://projecteuclid.org/euclid.aoas/1183143731.

