The Annals of Statistics

Nonparametric estimation of genewise variance for microarray data

Jianqing Fan, Yang Feng, and Yue S. Niu

Abstract

Estimation of genewise variance arises in two important applications in microarray data analysis: selecting significantly differentially expressed genes and validation tests for normalization of microarray data. We approach the problem by introducing a two-way nonparametric model, which extends the well-known Neyman–Scott model and is applicable beyond microarray data. The problem poses interesting challenges because the number of nuisance parameters is proportional to the sample size, and it is not obvious how the variance function can be estimated when measurements are correlated. For this high-dimensional nonparametric problem, we propose two novel nonparametric estimators of the genewise variance function and semiparametric estimators of the measurement correlation, obtained by solving a system of nonlinear equations. Their asymptotic normality is established, and their finite-sample properties are demonstrated by simulation studies. The estimators also improve the power of tests for detecting statistically differentially expressed genes. The methodology is illustrated with data from the MicroArray Quality Control (MAQC) project.
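
To illustrate why a number of nuisance parameters proportional to the sample size is a difficulty, the following is a minimal sketch of the classical Neyman–Scott setting that the abstract's model extends. It is not the estimator proposed in the paper. The function name `neyman_scott_demo`, its parameters, and the simulated Gaussian data are assumptions made for this illustration only.

```python
import numpy as np


def neyman_scott_demo(n_groups=20000, n_reps=2, sigma2=1.0, seed=0):
    """Simulate the classical Neyman-Scott problem (hypothetical helper).

    Each group has its own unknown mean (a nuisance parameter) and all
    groups share one variance sigma2. Returns the naive maximum-likelihood
    variance estimate and the degrees-of-freedom-corrected estimate.
    """
    rng = np.random.default_rng(seed)
    # One nuisance mean per group; their number grows with the sample size.
    means = rng.normal(0.0, 5.0, size=n_groups)
    noise = rng.normal(0.0, np.sqrt(sigma2), size=(n_groups, n_reps))
    x = means[:, None] + noise

    # Residuals after removing each group's sample mean.
    resid = x - x.mean(axis=1, keepdims=True)
    ss = np.sum(resid ** 2)

    # Naive MLE divides by the total number of observations.
    mle = ss / (n_groups * n_reps)
    # Corrected estimate accounts for one estimated mean per group.
    corrected = ss / (n_groups * (n_reps - 1))
    return mle, corrected


if __name__ == "__main__":
    mle, corrected = neyman_scott_demo()
    print(f"naive MLE: {mle:.3f}, corrected: {corrected:.3f}")
```

With two replicates per group, the naive estimate settles near half of the true variance however many groups are added, whereas the corrected estimate settles near the true value. The paper's setting is harder still, since the variance is a function to be estimated and the measurements may be correlated.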

Article information

Source
Ann. Statist. Volume 38, Number 5 (2010), 2723-2750.

Dates
First available in Project Euclid: 11 July 2010

Permanent link to this document
http://projecteuclid.org/euclid.aos/1278861458

Digital Object Identifier
doi:10.1214/10-AOS802

Mathematical Reviews number (MathSciNet)
MR2722454

Zentralblatt MATH identifier
1200.62133

Subjects
Primary: 62G05: Estimation
Secondary: 62P10: Applications to biology and medical sciences

Keywords
Genewise variance estimation; gene selection; local linear regression; nonparametric model; correlation correction; validation test

Citation

Fan, Jianqing; Feng, Yang; Niu, Yue S. Nonparametric estimation of genewise variance for microarray data. Ann. Statist. 38 (2010), no. 5, 2723–2750. doi:10.1214/10-AOS802. http://projecteuclid.org/euclid.aos/1278861458.


References

  • Carroll, R. J. and Wang, Y. (2008). Nonparametric variance estimation in the analysis of microarray data: A measurement error approach. Biometrika 95 437–449.
  • Cui, X., Hwang, J. T. and Qiu, J. (2005). Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics 6 59–75.
  • Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman and Hall, London.
  • Fan, J. and Niu, Y. (2007). Selection and validation of normalization methods for c-DNA microarrays using within-array replications. Bioinformatics 23 2391–2398.
  • Fan, J., Peng, H. and Huang, T. (2005). Semilinear high-dimensional model for normalization of microarray data: A theoretical analysis and partial consistency (with discussion). J. Amer. Statist. Assoc. 100 781–813.
  • Fan, J. and Ren, Y. (2007). Statistical analysis of DNA microarray data in cancer research. Clinical Cancer Research 12 4469–4473.
  • Fan, J., Tam, P., Vande Woude, G. and Ren, Y. (2004). Normalization and analysis of cDNA microarrays using within-array replications applied to neuroblastoma cell response to a cytokine. Proc. Natl. Acad. Sci. USA 101 1135–1140.
  • Huang, J., Wang, D. and Zhang, C. (2005). A two-way semi-linear model for normalization and significant analysis of cDNA microarray data. J. Amer. Statist. Assoc. 100 814–829.
  • Kamb, A. and Ramaswami, A. (2001). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data. BMC Biotechnology 1 8.
  • Neyman, J. and Scott, E. (1948). Consistent estimates based on partially consistent observations. Econometrica 16 1–32.
  • Patterson, T. et al. (2006). Performance comparison of one-color and two-color platforms within the MicroArray Quality Control (MAQC) project. Nature Biotechnology 24 1140–1150.
  • Ruppert, D., Wand, M. P., Holst, U. and Hössjer, O. (1997). Local polynomial variance function estimation. Technometrics 39 262–273.
  • Smyth, G., Michaud, J. and Scott, H. (2005). Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics 21 2067–2075.
  • Storey, J. D. and Tibshirani, R. (2003). Statistical significance for genome-wide studies. Proc. Natl. Acad. Sci. USA 100 9440–9445.
  • Tong, T. and Wang, Y. (2007). Optimal shrinkage estimation of variances with applications to microarray data analysis. J. Amer. Statist. Assoc. 102 113–122.
  • Tusher, V. G., Tibshirani, R. and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 98 5116–5121.
  • Wang, Y., Ma, Y. and Carroll, R. J. (2009). Variance estimation in the analysis of microarray data. J. Roy. Statist. Soc. Ser. B 71 425–445.