Open Access
April 2010
A two-sample test for high-dimensional data with applications to gene-set testing
Song Xi Chen, Ying-Li Qin
Ann. Statist. 38(2): 808-835 (April 2010). DOI: 10.1214/09-AOS716

Abstract

We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling’s classical T² test does not work in this “large p, small n” situation. The proposed test does not require explicit conditions on the relationship between the data dimension and the sample size, which offers much flexibility in analyzing high-dimensional data. One application of the proposed test is testing the significance of sets of genes, which we demonstrate in an empirical study on a leukemia data set.
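
The abstract does not state the test statistic, so the following is only a minimal sketch under stated assumptions. It illustrates two points: the pooled sample covariance that Hotelling’s T² must invert is singular once p exceeds n1 + n2 - 2, and a statistic built only from inner products of distinct observations (the form usually attributed to this paper) needs no matrix inverse. The function name `cq_statistic`, the simulated data, and the sample sizes are hypothetical; the variance estimate and normal approximation needed for an actual p-value are omitted.

```python
import numpy as np


def cq_statistic(x1, x2):
    """Inner-product statistic estimating ||mu1 - mu2||^2 without bias.

    x1 has shape (n1, p) and x2 has shape (n2, p). Assumed form, not
    quoted from the abstract: average of X1i'X1j over i != j, plus the
    same for sample 2, minus twice the average of X1i'X2j.
    """
    n1, n2 = x1.shape[0], x2.shape[0]
    g11 = x1 @ x1.T  # (n1, n1) Gram matrix of sample 1
    g22 = x2 @ x2.T  # (n2, n2) Gram matrix of sample 2
    g12 = x1 @ x2.T  # (n1, n2) cross inner products
    # Dropping the diagonal removes the X'X terms that would add bias.
    within1 = (g11.sum() - np.trace(g11)) / (n1 * (n1 - 1))
    within2 = (g22.sum() - np.trace(g22)) / (n2 * (n2 - 1))
    cross = 2.0 * g12.sum() / (n1 * n2)
    return within1 + within2 - cross


# Hypothetical "large p, small n" data: 200 variables, 10 and 12 samples.
rng = np.random.default_rng(0)
n1, n2, p = 10, 12, 200
x1 = rng.standard_normal((n1, p))
x2 = rng.standard_normal((n2, p)) + 0.5  # mean shift in every coordinate

# Pooled covariance used by Hotelling's T^2: rank is at most n1 + n2 - 2.
pooled = (
    (n1 - 1) * np.cov(x1, rowvar=False) + (n2 - 1) * np.cov(x2, rowvar=False)
) / (n1 + n2 - 2)
pooled_rank = np.linalg.matrix_rank(pooled)

t_n = cq_statistic(x1, x2)
print(f"p = {p}, rank of pooled covariance = {pooled_rank}")
print(f"inner-product statistic = {t_n:.3f}")
```

Because the statistic only touches the n1 × n1, n2 × n2, and n1 × n2 Gram matrices, its cost grows linearly in p, which is the practical reason this kind of construction suits gene-set data.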

Citation

Song Xi Chen, Ying-Li Qin. "A two-sample test for high-dimensional data with applications to gene-set testing." Ann. Statist. 38(2): 808-835, April 2010. https://doi.org/10.1214/09-AOS716

Information

Published: April 2010
First available in Project Euclid: 19 February 2010

zbMATH: 1183.62095
MathSciNet: MR2604697
Digital Object Identifier: 10.1214/09-AOS716

Subjects:
Primary: 60K35, 62H15
Secondary: 62G10

Keywords: gene-set testing, high dimension, large p small n, martingale central limit theorem, multiple comparison

Rights: Copyright © 2010 Institute of Mathematical Statistics
