The Annals of Statistics

Flexible results for quadratic forms with applications to variance components estimation

Lee H. Dicker and Murat A. Erdogdu

Abstract

We derive convenient uniform concentration bounds and finite sample multivariate normal approximation results for quadratic forms, then describe some applications involving variance components estimation in linear random-effects models. Random-effects models and variance components estimation are classical topics in statistics, with a corresponding well-established asymptotic theory. However, our finite sample results for quadratic forms provide additional flexibility for easily analyzing random-effects models in nonstandard settings, which are becoming more important in modern applications (e.g., genomics). For instance, in addition to deriving novel non-asymptotic bounds for variance components estimators in classical linear random-effects models, we provide a concentration bound for variance components estimators in linear models with correlated random-effects and discuss an application involving sparse random-effects models. Our general concentration bound is a uniform version of the Hanson–Wright inequality. The main normal approximation result in the paper is derived using the embedding technique of Reinert and Röllin [Ann. Probab. (2009) 37 2150–2173] for Stein’s method of exchangeable pairs.
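
For orientation, the classical (non-uniform) Hanson–Wright inequality, in the form given by Rudelson and Vershynin (2013) in the reference list, can be sketched as follows. The symbols X, A, K, c and t are notation assumed here for illustration and are not taken from the abstract; the paper's uniform version is not reproduced.

```latex
% Classical Hanson–Wright inequality (form of Rudelson and Vershynin, 2013).
% Assumed notation, not from the abstract:
%   X = (X_1, ..., X_n): independent, mean-zero, sub-Gaussian coordinates
%                        with \|X_i\|_{\psi_2} \le K
%   A: fixed n x n real matrix;  c > 0: an absolute constant
\[
\mathbb{P}\left( \left| X^{\top} A X - \mathbb{E}\, X^{\top} A X \right| > t \right)
\le 2 \exp\left( -c \min\left\{ \frac{t^{2}}{K^{4} \|A\|_{\mathrm{HS}}^{2}},\
\frac{t}{K^{2} \|A\|} \right\} \right),
\qquad t \ge 0 .
\]
```

Here \|A\|_{HS} denotes the Hilbert–Schmidt (Frobenius) norm and \|A\| the operator norm. A uniform version, as the abstract describes it, would presumably control such deviations simultaneously over a family of matrices A rather than for a single fixed A; the precise statement is in the paper itself.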

Article information

Source
Ann. Statist., Volume 45, Number 1 (2017), 386-414.

Dates
Received: September 2015
Revised: February 2016
First available in Project Euclid: 21 February 2017

Permanent link to this document
https://projecteuclid.org/euclid.aos/1487667627

Digital Object Identifier
doi:10.1214/16-AOS1456

Mathematical Reviews number (MathSciNet)
MR3611496

Zentralblatt MATH identifier
1364.62040

Subjects
Primary: 62F99: None of the above, but in this section
Secondary: 62E17: Approximations to distributions (nonasymptotic); 62F12: Asymptotic properties of estimators

Keywords
Hanson–Wright inequality; random-effects models; model misspecification; Stein’s method; uniform concentration bounds

Citation

Dicker, Lee H.; Erdogdu, Murat A. Flexible results for quadratic forms with applications to variance components estimation. Ann. Statist. 45 (2017), no. 1, 386–414. doi:10.1214/16-AOS1456. https://projecteuclid.org/euclid.aos/1487667627


References

  • Adamczak, R. (2015). A note on the Hanson–Wright inequality for random vectors with dependencies. Electron. Commun. Probab. 20 no. 72, 13.
  • Bai, Z. D., Miao, B. and Yao, J.-F. (2003). Convergence rates of spectral distributions of large sample covariance matrices. SIAM J. Matrix Anal. Appl. 25 105–127.
  • Bai, Z. and Silverstein, J. W. (2010). Spectral Analysis of Large Dimensional Random Matrices, 2nd ed. Springer, New York.
  • Chatterjee, S. (2008). A new method of normal approximation. Ann. Probab. 36 1584–1610.
  • de Los Campos, G., Sorensen, D. and Gianola, D. (2015). Genomic heritability: What is it? PLoS Genet. 11 e1005048.
  • Demidenko, E. (2013). Mixed Models: Theory and Applications with R, 2nd ed. Wiley, Hoboken, NJ.
  • Dicker, L. H. and Erdogdu, M. A. (2016a). Maximum likelihood for variance estimation in high-dimensional linear models. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics 159–167. JMLR Workshop & Conference Proceedings.
  • Dicker, L. H. and Erdogdu, M. A. (2016b). Supplement to “Flexible results for quadratic forms with applications to variance components estimation.” DOI:10.1214/16-AOS1456SUPP.
  • Golan, D. and Rosset, S. (2011). Accurate estimation of heritability in genome wide studies using random effects models. Bioinformatics 27 i317–i323.
  • Götze, F. and Tikhomirov, A. N. (1999). Asymptotic distribution of quadratic forms. Ann. Probab. 27 1072–1098.
  • Götze, F. and Tikhomirov, A. (2002). Asymptotic distribution of quadratic forms and applications. J. Theoret. Probab. 15 423–475.
  • Hall, P. (1984). Central limit theorem for integrated square error of multivariate nonparametric density estimators. J. Multivariate Anal. 14 1–16.
  • Hanson, D. L. and Wright, F. T. (1971). A bound on tail probabilities for quadratic forms in independent random variables. Ann. Math. Stat. 42 1079–1083.
  • Hartley, H. O. and Rao, J. N. K. (1967). Maximum-likelihood estimation for the mixed analysis of variance model. Biometrika 54 93–108.
  • Harville, D. A. (1977). Maximum likelihood approaches to variance component estimation and to related problems. J. Amer. Statist. Assoc. 72 320–340.
  • Henderson, C. R. (1950). Abstract: Estimation of genetic parameters. Ann. Math. Stat. 21 309–310.
  • Hsu, D., Kakade, S. M. and Zhang, T. (2012). A tail inequality for quadratic forms of subgaussian random vectors. Electron. Commun. Probab. 17 no. 52, 6.
  • Jiang, J. (1996). REML estimation: Asymptotic behavior and related topics. Ann. Statist. 24 255–286.
  • Jiang, J. (1998). Asymptotic properties of the empirical BLUP and BLUE in mixed linear models. Statist. Sinica 8 861–885.
  • Jiang, J., Li, C., Paul, D., Yang, C. and Zhao, H. (2016). On high-dimensional genome-wide association study and misspecified mixed model analysis. Ann. Statist. 44 2127–2160.
  • Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295–327.
  • Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation, 2nd ed. Springer, New York.
  • Reinert, G. and Röllin, A. (2009). Multivariate normal approximation with Stein’s method of exchangeable pairs under a general linearity condition. Ann. Probab. 37 2150–2173.
  • Richardson, A. M. and Welsh, A. H. (1994). Asymptotic properties of restricted maximum likelihood (REML) estimates for hierarchical mixed linear models. Aust. J. Stat. 36 31–43.
  • Rudelson, M. and Vershynin, R. (2013). Hanson–Wright inequality and sub-Gaussian concentration. Electron. Commun. Probab. 18 no. 82, 9.
  • Searle, S. R., Casella, G. and McCulloch, C. E. (1992). Variance Components. Wiley, New York.
  • Sevastyanov, B. A. (1961). A class of limit distributions for quadratic forms of normal stochastic variables. Theory Probab. Appl. 6 337–340.
  • Speed, D., Hemani, G., Johnson, M. R. and Balding, D. J. (2012). Improved heritability estimation from genome-wide SNPs. Am. J. Hum. Genet. 91 1011–1021.
  • Van de Geer, S. A. (2000). Empirical Processes in M-Estimation. Cambridge Univ. Press, Cambridge.
  • van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge.
  • Vershynin, R. (2012). Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing 210–268. Cambridge Univ. Press, Cambridge.
  • Whittle, P. (1964). On the convergence to normality of quadratic forms in independent variables. Teor. Veroyatn. Primen. 9 113–118.
  • Yang, J., Benyamin, B., McEvoy, B. P., Gordon, S., Henders, A. K., Nyholt, D. R., Madden, P. A., Heath, A. C., Martin, N. G., Montgomery, G. W., Goddard, M. E. and Visscher, P. M. (2010). Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42 565–569.
  • Yang, J., Zaitlen, N. A., Goddard, M. E., Visscher, P. M. and Price, A. L. (2014). Advantages and pitfalls in the application of mixed-model association methods. Nat. Genet. 46 100–106.
  • Zaitlen, N. A. and Kraft, P. (2012). Heritability in the genome-wide association era. Hum. Genet. 131 1655–1664.

Supplemental materials

  • Supplement to “Flexible results for quadratic forms with applications to variance components estimation”. The supplementary document Dicker and Erdogdu (2016b) contains proofs of Propositions 1–4, along with statements and proofs of additional auxiliary results.