Marginal asymptotics for the “large $p$, small $n$” paradigm: With applications to microarray data

Michael R. Kosorok; Shuangge Ma

doi:10.1214/009053606000001433

Ann. Statist. 35(4): 1456-1486 (August 2007). DOI: 10.1214/009053606000001433

Abstract

The “large $p$, small $n$” paradigm arises in microarray studies, image analysis, high-throughput molecular screening, astronomy, and in many other high-dimensional applications. False discovery rate (FDR) methods are useful for resolving the accompanying multiple testing problems. In cDNA microarray studies, for example, $p$-values may be computed for each of $p$ genes using data from $n$ arrays, where typically $p$ is in the thousands and $n$ is less than 30. For FDR methods to be valid in identifying differentially expressed genes, the $p$-values for the nondifferentially expressed genes must simultaneously have uniform distributions marginally. While feasible for permutation $p$-values, this uniformity is problematic for asymptotic-based $p$-values since the number of $p$-values involved goes to infinity and intuition suggests that at least some of the $p$-values should behave erratically. We examine this neglected issue when $n$ is moderately large but $p$ is almost exponentially large relative to $n$. We show the somewhat surprising result that, under very general dependence structures and for both mean and median tests, the $p$-values are simultaneously valid. A small simulation study and data analysis are used for illustration.
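To make the setting concrete, the following is a minimal sketch, not the authors' code or simulation design. It assumes independent Gaussian data for each gene (the paper allows far more general dependence), computes a two-sided asymptotic $p$-value for each of $p$ genes from $n$ arrays using the normal approximation to the one-sample $t$-statistic, and then applies the standard Benjamini-Hochberg step-up procedure as one example of an FDR method. The function names and the parameters `n_signal` and `shift` are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def asymptotic_p_value(sample):
    """Two-sided p-value for H0: mean = 0, using the normal
    approximation to the one-sample t-statistic."""
    n = len(sample)
    t = statistics.fmean(sample) / (statistics.stdev(sample) / math.sqrt(n))
    # For Z ~ N(0, 1), P(|Z| > |t|) = erfc(|t| / sqrt(2)).
    return math.erfc(abs(t) / math.sqrt(2))


def benjamini_hochberg(p_values, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose ordered p-value is <= q * rank / m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k = rank
    return sorted(order[:k])


def simulate(p=2000, n=20, n_signal=100, shift=1.5, seed=0):
    """Simulate p genes on n arrays (assumption: independent N(mu, 1) data).
    The first n_signal genes have mean `shift`; the rest are true nulls."""
    rng = random.Random(seed)
    p_values = []
    for j in range(p):
        mu = shift if j < n_signal else 0.0
        sample = [rng.gauss(mu, 1.0) for _ in range(n)]
        p_values.append(asymptotic_p_value(sample))
    return p_values


if __name__ == "__main__":
    pv = simulate()
    rejected = benjamini_hochberg(pv, q=0.05)
    false_pos = sum(1 for i in rejected if i >= 100)
    print(f"rejected {len(rejected)} genes, {false_pos} of them true nulls")
```

Because the normal approximation is only asymptotically exact, the null $p$-values in such a sketch are not exactly uniform when $n$ is small; whether that approximation holds simultaneously across all $p$ genes is the question the paper addresses.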

Citation

Michael R. Kosorok, Shuangge Ma. "Marginal asymptotics for the “large $p$, small $n$” paradigm: With applications to microarray data." Ann. Statist. 35 (4) 1456-1486, August 2007. https://doi.org/10.1214/009053606000001433

Information

Published: August 2007

First available in Project Euclid: 29 August 2007

zbMATH: 1123.62005

MathSciNet: MR2351093

Digital Object Identifier: 10.1214/009053606000001433

Subjects:

Primary: 62A01, 62H15

Secondary: 62G20, 62G30

Keywords: Brownian bridge, Brownian motion, empirical process, false discovery rate, Hungarian construction, marginal asymptotics, maximal inequalities, median tests, microarrays, t-tests
