Electronic Journal of Statistics

Asymptotic properties of predictive recursion: Robustness and rate of convergence

Ryan Martin and Surya T. Tokdar

Full-text: Open access

Abstract

Here we explore general asymptotic properties of Predictive Recursion (PR) for nonparametric estimation of mixing distributions. We prove that, when the mixture model is mis-specified, the estimated mixture converges almost surely in total variation to the mixture that minimizes the Kullback-Leibler divergence, and a bound on the (Hellinger contrast) rate of convergence is obtained. Simulations suggest that this rate is nearly sharp in a minimax sense. Moreover, when the model is identifiable, almost sure weak convergence of the mixing distribution estimate follows.

PR assumes that the support of the mixing distribution is known. To remove this requirement, we propose a generalization that incorporates a sequence of supports, increasing with the sample size, that combines the efficiency of PR with the flexibility of mixture sieves. Under mild conditions, we obtain a bound on the rate of convergence of these new estimates.
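The PR update itself is a simple one-pass recursion: given observation $x_i$, kernel $p(x\mid\theta)$, and weight $w_i$, the current mixing-density estimate is updated by $f_i(\theta) = (1-w_i)\,f_{i-1}(\theta) + w_i\,p(x_i\mid\theta)f_{i-1}(\theta)/\int p(x_i\mid\theta')f_{i-1}(\theta')\,d\theta'$. The sketch below is an illustrative discretized implementation on a fixed grid (the grid, the normal kernel, and the default weights $w_i = 1/(i+1)$ are choices made here for illustration, not prescriptions from the paper):

```python
import numpy as np

def predictive_recursion(x, grid, kernel, w=None):
    """One pass of predictive recursion (PR) on a fixed support grid.

    x      : 1-d array of observations
    grid   : discretized support of the mixing distribution
    kernel : kernel(x_i, grid) -> array of p(x_i | theta) over the grid
    w      : weight sequence; defaults to w_i = 1/(i+1)
    """
    n = len(x)
    if w is None:
        w = 1.0 / np.arange(2, n + 2)    # w_i = 1/(i+1), i = 1, ..., n
    du = np.gradient(grid)               # quadrature weights for the grid
    f = np.ones_like(grid)
    f /= np.sum(f * du)                  # uniform initial guess f_0
    for i in range(n):
        lik = kernel(x[i], grid)         # p(x_i | theta) over the grid
        denom = np.sum(lik * f * du)     # predictive density of x_i
        f = (1 - w[i]) * f + w[i] * lik * f / denom
    return f

# Toy example: normal location mixture with unit variance.
rng = np.random.default_rng(0)
theta_grid = np.linspace(-5.0, 5.0, 200)
normal_kernel = lambda xi, t: np.exp(-0.5 * (xi - t) ** 2) / np.sqrt(2 * np.pi)
data = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
f_hat = predictive_recursion(rng.permutation(data), theta_grid, normal_kernel)
```

Note that the estimate depends on the order of the observations, which is why the toy example permutes the data before the pass; each update is convex, so the estimate remains a probability density on the grid at every step.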

Article information

Source
Electron. J. Statist., Volume 3 (2009), 1455-1472.

Dates
First available in Project Euclid: 24 December 2009

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1261671305

Digital Object Identifier
doi:10.1214/09-EJS458

Mathematical Reviews number (MathSciNet)
MR2578833

Zentralblatt MATH identifier
1326.62107

Subjects
Primary: 62G20: Asymptotic properties
Secondary: 62G05: Estimation; 62G07: Density estimation; 62G35: Robustness

Keywords
Almost supermartingale; density estimation; empirical Bayes; Kullback-Leibler projection; mixture models

Citation

Martin, Ryan; Tokdar, Surya T. Asymptotic properties of predictive recursion: Robustness and rate of convergence. Electron. J. Statist. 3 (2009), 1455--1472. doi:10.1214/09-EJS458. https://projecteuclid.org/euclid.ejs/1261671305



References

  • [1] Barron, A. R. (2000). Limits of information, Markov chains, and projection. In IEEE International Symposium on Information Theory 25.
  • [2] Billingsley, P. (1995). Probability and Measure, Third ed. John Wiley & Sons Inc., New York.
  • [3] Bogdan, M., Ghosh, J. K. and Tokdar, S. T. (2008). A comparison of the Benjamini-Hochberg procedure with some Bayesian rules for multiple testing. In Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen (N. Balakrishnan, E. Peña and M. Silvapulle, eds.) 211–230. IMS, Beachwood, OH.
  • [4] Brown, L. D., George, E. I. and Xu, X. (2008). Admissible predictive density estimation. Ann. Statist. 36 1156–1170.
  • [5] Efron, B. (2008). Microarrays, empirical Bayes and the two-groups model. Statist. Sci. 23 1–22.
  • [6] Genovese, C. R. and Wasserman, L. (2000). Rates of convergence for the Gaussian mixture sieve. Ann. Statist. 28 1105–1127.
  • [7] Ghosal, S. and van der Vaart, A. W. (2001). Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Statist. 29 1233–1263.
  • [8] Ghosh, J. K. and Tokdar, S. T. (2006). Convergence and consistency of Newton’s algorithm for estimating mixing distribution. In Frontiers in Statistics 429–443. Imp. Coll. Press, London.
  • [9] Kemperman, J. H. B. (1969). On the optimum rate of transmitting information. Ann. Math. Statist. 40 2156–2177.
  • [10] Kleijn, B. J. K. and van der Vaart, A. W. (2006). Misspecification in infinite-dimensional Bayesian statistics. Ann. Statist. 34 837–877.
  • [11] Lai, T. L. (2003). Stochastic approximation. Ann. Statist. 31 391–406.
  • [12] Leroux, B. G. (1992). Consistent estimation of a mixing distribution. Ann. Statist. 20 1350–1360.
  • [13] Liese, F. and Vajda, I. (1987). Convex Statistical Distances. Teubner, Leipzig.
  • [14] Li, J. Q. and Barron, A. R. (2000). Mixture density estimation. In Advances in Neural Information Processing Systems (S. Solla, T. Leen and K.-R. Mueller, eds.) 279–285. MIT Press, Cambridge, Massachusetts.
  • [15] Lindsay, B. G. (1983). The geometry of mixture likelihoods: a general theory. Ann. Statist. 11 86–94.
  • [16] Martin, R. and Ghosh, J. K. (2008). Stochastic approximation and Newton’s estimate of a mixing distribution. Statist. Sci. 23 365–382.
  • [17] Newton, M. A. (2002). On a nonparametric recursive estimator of the mixing distribution. Sankhyā Ser. A 64 306–322.
  • [18] Newton, M. A., Quintana, F. A. and Zhang, Y. (1998). Nonparametric Bayes methods using predictive updating. In Practical Nonparametric and Semiparametric Bayesian Statistics, 133 45–61. Springer, New York.
  • [19] Newton, M. A. and Zhang, Y. (1999). A recursive algorithm for nonparametric analysis with missing data. Biometrika 86 15–26.
  • [20] Patilea, V. (2001). Convex models, MLE and misspecification. Ann. Statist. 29 94–123.
  • [21] Pratt, J. W. (1960). On interchanging limits and integrals. Ann. Math. Statist. 31 74–77.
  • [22] Robbins, H. and Siegmund, D. (1971). A convergence theorem for nonnegative almost supermartingales and some applications. In Optimizing Methods in Statistics (Proc. Sympos., Ohio State Univ., Columbus, Ohio, 1971) 233–257. Academic Press, New York.
  • [23] Tokdar, S. T., Martin, R. and Ghosh, J. K. (2009). Consistency of a recursive estimate of mixing distributions. Ann. Statist. 37 2502–2522.