The Annals of Statistics

Strong identifiability and optimal minimax rates for finite mixture estimation

Philippe Heinrich and Jonas Kahn


We study the rates of estimation of finite mixing distributions, that is, the parameters of the mixture. We prove that under some regularity and strong identifiability conditions, around a given mixing distribution with $m_{0}$ components, the optimal local minimax rate of estimation of a mixing distribution with $m$ components is $n^{-1/(4(m-m_{0})+2)}$. This corrects a previous paper by Chen [Ann. Statist. 23 (1995) 221–233].

By contrast, it turns out that there are estimators with a (nonuniform) pointwise rate of estimation of $n^{-1/2}$ for all mixing distributions with a finite number of components.
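
To give a feel for how quickly the local minimax rate degrades as the fitted mixture is allowed more components than the true one, the sketch below evaluates the exponent in $n^{-1/(4(m-m_{0})+2)}$ and, ignoring all constants, the order of sample size at which that rate reaches a target accuracy. The function names and the target accuracy of 0.1 are illustrative assumptions, not taken from the paper, and the code says nothing about the loss in which the rate is measured or about the regularity conditions required.

```python
from fractions import Fraction


def local_minimax_exponent(m: int, m0: int) -> Fraction:
    """Return a such that the stated local minimax rate is n**(-a).

    Uses the formula quoted in the abstract: a = 1 / (4 * (m - m0) + 2).
    The function name is hypothetical, chosen for this illustration only.
    """
    if m0 < 1 or m < m0:
        raise ValueError("need 1 <= m0 <= m")
    return Fraction(1, 4 * (m - m0) + 2)


def samples_for_accuracy(eps: float, m: int, m0: int) -> float:
    """Order of n at which n**(-a) equals eps, with all constants ignored."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    # Solving n**(-a) = eps gives n = eps**(-1/a) = eps**(-(4*(m - m0) + 2)).
    return eps ** -(4 * (m - m0) + 2)


if __name__ == "__main__":
    m0 = 2  # assumed true number of components, for illustration
    for extra in range(3):
        m = m0 + extra
        a = local_minimax_exponent(m, m0)
        n = samples_for_accuracy(0.1, m, m0)
        print(f"m - m0 = {extra}: rate n^(-{a}), n of order {n:.3g}")
```

With $m = m_{0}$ the exponent is $1/2$, the same exponent as the pointwise rate mentioned above; each additional allowed component adds 4 to the denominator, so the exponent becomes $1/6$, then $1/10$, and so on.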

Article information

Ann. Statist., Volume 46, Number 6A (2018), 2844–2870.

Received: July 2015
Revised: October 2017
First available in Project Euclid: 7 September 2018

Primary: 62G05: Estimation
Secondary: 62G20: Asymptotic properties

Keywords: Local asymptotic normality; convergence of experiments; maximum likelihood estimate; Wasserstein metric; mixing distribution; mixture model; rate of convergence; strong identifiability; pointwise rate; superefficiency


Heinrich, Philippe; Kahn, Jonas. Strong identifiability and optimal minimax rates for finite mixture estimation. Ann. Statist. 46 (2018), no. 6A, 2844–2870. doi:10.1214/17-AOS1641.


  • Bontemps, D. and Gadat, S. (2014). Bayesian methods for the shape invariant model. Electron. J. Stat. 8 1522–1568.
  • Caillerie, C., Chazal, F., Dedecker, J. and Michel, B. (2013). Deconvolution for the Wasserstein metric and geometric inference. In Geometric Science of Information 561–568. Springer, Berlin.
  • Chen, J. H. (1995). Optimal rate of convergence for finite mixture models. Ann. Statist. 23 221–233.
  • Dacunha-Castelle, D. and Gassiat, E. (1997). The estimation of the order of a mixture model. Bernoulli 3 279–299.
  • Dedecker, J. and Michel, B. (2013). Minimax rates of convergence for Wasserstein deconvolution with supersmooth errors in any dimension. J. Multivariate Anal. 122 278–291.
  • Deely, J. J. and Kruse, R. L. (1968). Construction of sequences estimating the mixing distribution. Ann. Math. Stat. 39 286–288.
  • Dudley, R. M. (2002). Real Analysis and Probability. Cambridge Studies in Advanced Mathematics 74. Cambridge Univ. Press, Cambridge.
  • Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19 1257–1272.
  • Gassiat, E. and van Handel, R. (2013). Consistent order estimation and minimal penalties. IEEE Trans. Inform. Theory 59 1115–1128.
  • Genovese, C. R. and Wasserman, L. (2000). Rates of convergence for the Gaussian mixture sieve. Ann. Statist. 28 1105–1127.
  • Ghosal, S. and van der Vaart, A. W. (2001). Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Statist. 29 1233–1263.
  • Hájek, J. (1972). Local asymptotic minimax and admissibility in estimation. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. I: Theory of Statistics 175–194. Univ. California Press, Berkeley, CA.
  • Heinrich, P. and Kahn, J. (2018). Supplement to “Strong identifiability and optimal minimax rates for finite mixture estimation.” DOI:10.1214/17-AOS1641SUPP.
  • Ho, N. and Nguyen, X. (2015). Identifiability and optimal rates of convergence for parameters of multiple types in finite mixtures. Preprint. Available at arXiv:1501.02497.
  • Ho, N. and Nguyen, X. (2016a). On strong identifiability and convergence rates of parameter estimation in finite mixtures. Electron. J. Stat. 10 271–307.
  • Ho, N. and Nguyen, X. (2016b). Convergence rates of parameter estimation for some weakly identifiable finite mixtures. Ann. Statist. 44 2726–2755.
  • Holzmann, H., Munk, A. and Stratmann, B. (2004). Identifiability of finite mixtures—with applications to circular distributions. Sankhyā 66 440–449.
  • Ishwaran, H., James, L. F. and Sun, J. (2001). Bayesian model selection in finite mixtures by marginal density decompositions. J. Amer. Statist. Assoc. 96 1316–1332.
  • Kuhn, M. A., Feigelson, E. D., Getman, K. V., Baddeley, A. J., Broos, P. S., Sills, A., Bate, M. R., Povich, M. S., Luhman, K. L., Busk, H. A. et al. (2014). The spatial structure of young stellar clusters. I. Subclusters. The Astrophysical Journal 787 107.
  • Le Cam, L. (1960). Locally asymptotically normal families of distributions. Certain approximations to families of distributions and their use in the theory of estimation and testing hypotheses. Univ. California Publ. Statist. 3 37–98.
  • Le Cam, L. (1986). Asymptotic Methods in Statistical Decision Theory. Springer, New York.
  • Lindsay, B. G. (1989). Moment matrices: Applications in mixtures. Ann. Statist. 17 722–740.
  • Liu, M. and Hancock, G. R. (2014). Unrestricted mixture models for class identification in growth mixture modeling. Educational and Psychological Measurement 74 557–584.
  • Martin, R. (2012). Convergence rate for predictive recursion estimation of finite mixtures. Statist. Probab. Lett. 82 378–384.
  • Massart, P. (1990). The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. Ann. Probab. 18 1269–1283.
  • McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley, New York.
  • Nguyen, X. (2013). Convergence of latent mixing measures in finite and infinite mixture models. Ann. Statist. 41 370–400.
  • Pearson, K. (1894). Contributions to the theory of mathematical evolution. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 185 71–110.
  • Rousseau, J. and Mengersen, K. (2011). Asymptotic behaviour of the posterior distribution in overfitted mixture models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 73 689–710.
  • Teh, Y. W. (2010). Dirichlet process. In Encyclopedia of Machine Learning (C. Sammut and G. Webb, eds.) 280–287. Springer, New York.
  • Titterington, D. M., Smith, A. F. M. and Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley, Chichester.
  • van de Geer, S. (1996). Rates of convergence for the maximum likelihood estimator in mixture models. J. Nonparametr. Stat. 6 293–310.
  • van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge.
  • Yang, Y. (2005). Can the strengths of AIC and BIC be shared? A conflict between model indentification and regression estimation. Biometrika 92 937–950.
  • Zhu, H.-T. and Zhang, H. (2004). Hypothesis testing in mixture regression models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 66 3–16.
  • Zhu, H. and Zhang, H. (2006). Asymptotics for estimation and testing procedures under loss of identifiability. J. Multivariate Anal. 97 19–45.

Supplemental materials

  • Auxiliary results and technical details. This supplement gathers proof details for some of the assertions made in the paper.