Bernoulli, Volume 14, Number 4 (2008), 1089–1107

Mixing least-squares estimators when the variance is unknown

Christophe Giraud

Full-text: Open access

Abstract

We propose a procedure to handle the problem of Gaussian regression when the variance is unknown. We mix least-squares estimators from various models according to a procedure inspired by that of Leung and Barron [IEEE Trans. Inform. Theory 52 (2006) 3396–3410]. We show that in some cases, the resulting estimator is a simple shrinkage estimator. We then apply this procedure to perform adaptive estimation in Besov spaces. Our results provide non-asymptotic risk bounds for the Euclidean risk of the estimator.
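The abstract describes aggregating least-squares estimators from several candidate models using Gibbs-type exponential weights. The sketch below illustrates the general idea on a toy Gaussian regression problem: fit each model by least squares, score it, and average the fits with exponential weights. The cosine basis, the AIC-style criterion (which plugs in each model's own variance estimate, since the true variance is unknown), and the weight temperature are illustrative assumptions, not the paper's exact penalization.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0.0, 1.0, n)
f = np.sin(4 * np.pi * x) * np.exp(-2 * x)  # unknown regression function
sigma = 0.3                                  # unknown to the statistician
y = f + sigma * rng.normal(size=n)

# Candidate models: truncated cosine bases of increasing dimension.
def design(d):
    return np.column_stack([np.cos(np.pi * k * x) for k in range(d)])

dims = range(1, 21)
fits, crits = [], []
for d in dims:
    X = design(d)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fhat = X @ beta                          # least-squares fit for model d
    rss = np.sum((y - fhat) ** 2)
    # AIC-style score using the model's own variance estimate rss/n;
    # a stand-in for the paper's exact criterion (assumption).
    crits.append(n * np.log(rss / n) + 2 * d)
    fits.append(fhat)

# Gibbs-style exponential weights over the candidate estimators.
crits = np.array(crits)
w = np.exp(-0.5 * (crits - crits.min()))
w /= w.sum()
f_mix = np.sum(w[:, None] * np.array(fits), axis=0)

risk_mix = np.mean((f_mix - f) ** 2)
risk_best = min(np.mean((fh - f) ** 2) for fh in fits)
print(f"mixture risk: {risk_mix:.4f}, best single model: {risk_best:.4f}")
```

In favourable cases the mixture's risk is close to that of the best single model, which is the flavour of the oracle inequalities established in the paper; the shrinkage-estimator form mentioned in the abstract arises for particular model collections and priors.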

Article information

Source
Bernoulli, Volume 14, Number 4 (2008), 1089-1107.

Dates
First available in Project Euclid: 6 November 2008

Permanent link to this document
https://projecteuclid.org/euclid.bj/1225980572

Digital Object Identifier
doi:10.3150/08-BEJ135

Mathematical Reviews number (MathSciNet)
MR2543587

Zentralblatt MATH identifier
1168.62327

Keywords
adaptive minimax estimation; Gibbs mixture; linear regression; oracle inequalities; shrinkage estimator

Citation

Giraud, Christophe. Mixing least-squares estimators when the variance is unknown. Bernoulli 14 (2008), no. 4, 1089--1107. doi:10.3150/08-BEJ135. https://projecteuclid.org/euclid.bj/1225980572



References

  • [1] Akaike, H. (1969). Statistical predictor identification. Ann. Inst. Statist. Math. 22 203–217.
  • [2] Baraud, Y., Giraud, C. and Huet, S. (2008). Gaussian model selection with unknown variance. Ann. Statist. To appear. arXiv:math/0701250v1.
  • [3] Barron, A. (1987). Are Bayesian Rules Consistent in Information? New York: Springer.
  • [4] Barron, A., Birgé, L. and Massart, P. (1999). Risk bounds for model selection via penalization. Probab. Theory Related Fields 113 301–413.
  • [5] Barron, A. and Cover, T. (1991). Minimum complexity density estimation. IEEE Trans. Inform. Theory 37 1034–1054.
  • [6] Birgé, L. and Massart, P. (2000). An adaptive compression algorithm in Besov spaces. Constr. Approx. 16 1–36.
  • [7] Birgé, L. and Massart, P. (2001). Gaussian model selection. J. Eur. Math. Soc. (JEMS) 3 203–268.
  • [8] Birgé, L. and Massart, P. (2007). Minimal penalties for Gaussian model selection. Probab. Theory Related Fields 138 33–73.
  • [9] Bunea, F., Tsybakov, A. and Wegkamp, M. (2007). Aggregation for Gaussian regression. Ann. Statist. 35 1674–1697.
  • [10] Catoni, O. (1997). Mixture approach to universal model selection. Preprint 30, Laboratoire de l'Ecole Normale Supérieure, Paris.
  • [11] Catoni, O. (1999). Universal aggregation rules with exact bias bounds. Preprint 510, Laboratoire de Probabilités et Modèles Aléatoires, CNRS, Paris.
  • [12] DeVore, R. and Lorentz, G. (1993). Constructive Approximation. New York: Springer.
  • [13] Donoho, D. and Johnstone, I. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika 81 425–455.
  • [14] Giraud, C. (2007). Mixing least-squares estimators when the variance is unknown. Technical report. arXiv:0711.0372v1.
  • [15] Hall, P., Kay, J. and Titterington, D.M. (1990). Asymptotically optimal difference-based estimation of variance in nonparametric regression. Biometrika 77 521–528.
  • [16] Hartigan, J.A. (2002). Bayesian regression using Akaike priors. Preprint, Yale Univ., New Haven.
  • [17] Lenth, R. (1989). Quick and easy analysis of unreplicated factorials. Technometrics 31 469–473.
  • [18] Leung, G. and Barron, A. (2006). Information theory and mixing least-squares regressions. IEEE Trans. Inform. Theory 52 3396–3410.
  • [19] Mallows, C. (1973). Some comments on Cp. Technometrics 15 661–675.
  • [20] Munk, A., Bissantz, N., Wagner, T. and Freitag, G. (2005). On difference based variance estimation in nonparametric regression when the covariate is high dimensional. J. Roy. Statist. Soc. Ser. B 67 19–41.
  • [21] Rice, J. (1984). Bandwidth choice for nonparametric kernel regression. Ann. Statist. 12 1215–1230.
  • [22] Tong, T. and Wang, Y. (2005). Estimating residual variance in nonparametric regression using least squares. Biometrika 92 821–830.
  • [23] Tsybakov, A. (2003). Optimal rates of aggregation. COLT-2003. Lecture Notes in Artificial Intelligence 2777 303–313. Heidelberg: Springer.
  • [24] Yang, Y. and Barron, A. (1999). Information-theoretic determination of minimax rates of convergence. Ann. Statist. 27 1564–1599.
  • [25] Yang, Y. (2000). Combining different procedures for adaptive regression. J. Multivariate Anal. 74 135–161.
  • [26] Yang, Y. (2000). Mixing strategies for density estimation. Ann. Statist. 28 75–87.
  • [27] Yang, Y. (2004). Combining forecasting procedures: Some theoretical results. Econometric Theory 20 176–222.
  • [28] Wang, L., Brown, L., Cai, T. and Levine, M. (2008). Effect of mean on variance function estimation in nonparametric regression. Ann. Statist. 36 646–664.