Electronic Journal of Statistics

Asymptotic properties of predictive recursion: Robustness and rate of convergence

Ryan Martin and Surya T. Tokdar

Full-text: Open access


Here we explore general asymptotic properties of Predictive Recursion (PR) for nonparametric estimation of mixing distributions. We prove that, when the mixture model is mis-specified, the estimated mixture converges almost surely in total variation to the mixture that minimizes the Kullback-Leibler divergence, and we obtain a bound on the (Hellinger contrast) rate of convergence. Simulations suggest that this rate is nearly sharp in a minimax sense. Moreover, when the model is identifiable, almost sure weak convergence of the mixing distribution estimate follows.

PR assumes that the support of the mixing distribution is known. To remove this requirement, we propose a generalization that incorporates a sequence of supports, increasing with the sample size, combining the efficiency of PR with the flexibility of mixture sieves. Under mild conditions, we obtain a bound on the rate of convergence of these new estimates.
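The PR update described above is simple to state: given observations x_1, x_2, … and a kernel k(x | θ), the estimate of the mixing density is revised one observation at a time, moving a small amount of mass toward values of θ that explain the new observation well. A minimal Python sketch on a fixed grid, assuming a normal kernel and the common weight sequence w_i = 1/(i+1) (the kernel, grid, and weight choices here are illustrative, not the paper's specification):

```python
import numpy as np

def predictive_recursion(x, grid, sd=1.0):
    """One pass of the PR update over the data, on a fixed grid.

    x    : 1-d array of observations from the mixture
    grid : discretized support for the mixing distribution
    sd   : standard deviation of the (assumed) normal kernel
    Returns a probability vector over `grid` estimating the mixing density.
    """
    f = np.full(len(grid), 1.0 / len(grid))        # uniform initial guess
    for i, xi in enumerate(x):
        wi = 1.0 / (i + 2)                         # w_i = 1/(i+1), i starting at 1
        # normal kernel k(x_i | theta) evaluated at every grid point
        like = np.exp(-0.5 * ((xi - grid) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        m = np.sum(like * f)                       # current mixture density at x_i
        f = (1 - wi) * f + wi * like * f / m       # PR update; stays normalized
    return f
```

Because the update is a convex combination of the old estimate and a normalized reweighting of it, each step preserves nonnegativity and total mass one, and the whole pass costs O(n × grid size) — the computational appeal of PR relative to likelihood-based mixture fitting.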

Article information

Electron. J. Statist., Volume 3 (2009), 1455-1472.

First available in Project Euclid: 24 December 2009

Primary: 62G20: Asymptotic properties
Secondary: 62G05: Estimation; 62G07: Density estimation; 62G35: Robustness

Keywords: almost supermartingale; density estimation; empirical Bayes; Kullback-Leibler projection; mixture models


Martin, Ryan; Tokdar, Surya T. Asymptotic properties of predictive recursion: Robustness and rate of convergence. Electron. J. Statist. 3 (2009), 1455--1472. doi:10.1214/09-EJS458. https://projecteuclid.org/euclid.ejs/1261671305

References


  • [1] Barron, A. R. (2000). Limits of information, Markov chains, and projection. In IEEE International Symposium on Information Theory 25.
  • [2] Billingsley, P. (1995). Probability and Measure, Third ed. John Wiley & Sons Inc., New York.
  • [3] Bogdan, M., Ghosh, J. K. and Tokdar, S. T. (2008). A comparison of the Benjamini-Hochberg procedure with some Bayesian rules for multiple testing. In Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen (N. Balakrishnan, E. Peña and M. Silvapulle, eds.) 211–230. IMS, Beachwood, OH.
  • [4] Brown, L. D., George, E. I. and Xu, X. (2008). Admissible predictive density estimation. Ann. Statist. 36 1156–1170.
  • [5] Efron, B. (2008). Microarrays, empirical Bayes and the two-groups model. Statist. Sci. 23 1–22.
  • [6] Genovese, C. R. and Wasserman, L. (2000). Rates of convergence for the Gaussian mixture sieve. Ann. Statist. 28 1105–1127.
  • [7] Ghosal, S. and van der Vaart, A. W. (2001). Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Statist. 29 1233–1263.
  • [8] Ghosh, J. K. and Tokdar, S. T. (2006). Convergence and consistency of Newton’s algorithm for estimating mixing distribution. In Frontiers in Statistics 429–443. Imp. Coll. Press, London.
  • [9] Kemperman, J. H. B. (1969). On the optimum rate of transmitting information. Ann. Math. Statist. 40 2156–2177.
  • [10] Kleijn, B. J. K. and van der Vaart, A. W. (2006). Misspecification in infinite-dimensional Bayesian statistics. Ann. Statist. 34 837–877.
  • [11] Lai, T. L. (2003). Stochastic approximation. Ann. Statist. 31 391–406.
  • [12] Leroux, B. G. (1992). Consistent estimation of a mixing distribution. Ann. Statist. 20 1350–1360.
  • [13] Liese, F. and Vajda, I. (1987). Convex Statistical Distances. Teubner, Leipzig.
  • [14] Li, J. Q. and Barron, A. R. (2000). Mixture density estimation. In Advances in Neural Information Processing Systems (S. Solla, T. Leen and K.-R. Mueller, eds.) 279–285. MIT Press, Cambridge, Massachusetts.
  • [15] Lindsay, B. G. (1983). The geometry of mixture likelihoods: a general theory. Ann. Statist. 11 86–94.
  • [16] Martin, R. and Ghosh, J. K. (2008). Stochastic approximation and Newton’s estimate of a mixing distribution. Statist. Sci. 23 365–382.
  • [17] Newton, M. A. (2002). On a nonparametric recursive estimator of the mixing distribution. Sankhyā Ser. A 64 306–322.
  • [18] Newton, M. A., Quintana, F. A. and Zhang, Y. (1998). Nonparametric Bayes methods using predictive updating. In Practical Nonparametric and Semiparametric Bayesian Statistics, 133 45–61. Springer, New York.
  • [19] Newton, M. A. and Zhang, Y. (1999). A recursive algorithm for nonparametric analysis with missing data. Biometrika 86 15–26.
  • [20] Patilea, V. (2001). Convex models, MLE and misspecification. Ann. Statist. 29 94–123.
  • [21] Pratt, J. W. (1960). On interchanging limits and integrals. Ann. Math. Statist. 31 74–77.
  • [22] Robbins, H. and Siegmund, D. (1971). A convergence theorem for nonnegative almost supermartingales and some applications. In Optimizing Methods in Statistics (Proc. Sympos., Ohio State Univ., Columbus, Ohio, 1971) 233–257. Academic Press, New York.
  • [23] Tokdar, S. T., Martin, R. and Ghosh, J. K. (2009). Consistency of a recursive estimate of mixing distributions. Ann. Statist. 37 2502–2522.