## The Annals of Statistics

### Strong identifiability and optimal minimax rates for finite mixture estimation

#### Abstract

We study the rates of estimation of finite mixing distributions, that is, the parameters of the mixture. We prove that under some regularity and strong identifiability conditions, around a given mixing distribution with $m_{0}$ components, the optimal local minimax rate of estimation of a mixing distribution with $m$ components is $n^{-1/(4(m-m_{0})+2)}$. This corrects a previous paper by Chen [Ann. Statist. 23 (1995) 221–233].

By contrast, it turns out that there are estimators with a (nonuniform) pointwise rate of estimation of $n^{-1/2}$ for all mixing distributions with a finite number of components.
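
As a quick illustration of the local minimax rate stated above (a worked instance of the abstract's formula, not a result quoted from the paper), write $d = m - m_{0}$ for the number of surplus components; the symbol $d$ is introduced here only for convenience:

$$
n^{-1/(4d+2)} =
\begin{cases}
n^{-1/2}, & d = 0,\\
n^{-1/6}, & d = 1,\\
n^{-1/10}, & d = 2.
\end{cases}
$$

Each component allowed beyond $m_{0}$ adds 4 to the denominator of the exponent, so the local minimax rate degrades quickly as the model is overfitted, whereas the pointwise rate stays at $n^{-1/2}$.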

#### Article information

Source
Ann. Statist., Volume 46, Number 6A (2018), 2844–2870.

Dates
Revised: October 2017
First available in Project Euclid: 7 September 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1536307235

Digital Object Identifier
doi:10.1214/17-AOS1641

Mathematical Reviews number (MathSciNet)
MR3851757

Zentralblatt MATH identifier
06968601

Subjects
Primary: 62G05: Estimation
Secondary: 62G20: Asymptotic properties

#### Citation

Heinrich, Philippe; Kahn, Jonas. Strong identifiability and optimal minimax rates for finite mixture estimation. Ann. Statist. 46 (2018), no. 6A, 2844–2870. doi:10.1214/17-AOS1641. https://projecteuclid.org/euclid.aos/1536307235

#### References

• Bontemps, D. and Gadat, S. (2014). Bayesian methods for the shape invariant model. Electron. J. Stat. 8 1522–1568.
• Caillerie, C., Chazal, F., Dedecker, J. and Michel, B. (2013). Deconvolution for the Wasserstein metric and geometric inference. In Geometric Science of Information 561–568. Springer, Berlin.
• Chen, J. H. (1995). Optimal rate of convergence for finite mixture models. Ann. Statist. 23 221–233.
• Dacunha-Castelle, D. and Gassiat, E. (1997). The estimation of the order of a mixture model. Bernoulli 3 279–299.
• Dedecker, J. and Michel, B. (2013). Minimax rates of convergence for Wasserstein deconvolution with supersmooth errors in any dimension. J. Multivariate Anal. 122 278–291.
• Deely, J. J. and Kruse, R. L. (1968). Construction of sequences estimating the mixing distribution. Ann. Math. Stat. 39 286–288.
• Dudley, R. M. (2002). Real Analysis and Probability. Cambridge Studies in Advanced Mathematics 74. Cambridge Univ. Press, Cambridge.
• Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19 1257–1272.
• Gassiat, E. and van Handel, R. (2013). Consistent order estimation and minimal penalties. IEEE Trans. Inform. Theory 59 1115–1128.
• Genovese, C. R. and Wasserman, L. (2000). Rates of convergence for the Gaussian mixture sieve. Ann. Statist. 28 1105–1127.
• Ghosal, S. and van der Vaart, A. W. (2001). Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Statist. 29 1233–1263.
• Hájek, J. (1972). Local asymptotic minimax and admissibility in estimation. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. I: Theory of Statistics 175–194. Univ. California Press, Berkeley, CA.
• Heinrich, P. and Kahn, J. (2018). Supplement to “Strong identifiability and optimal minimax rates for finite mixture estimation.” DOI:10.1214/17-AOS1641SUPP.
• Ho, N. and Nguyen, X. (2015). Identifiability and optimal rates of convergence for parameters of multiple types in finite mixtures. Preprint. Available at arXiv:1501.02497.
• Ho, N. and Nguyen, X. (2016a). On strong identifiability and convergence rates of parameter estimation in finite mixtures. Electron. J. Stat. 10 271–307.
• Ho, N. and Nguyen, X. (2016b). Convergence rates of parameter estimation for some weakly identifiable finite mixtures. Ann. Statist. 44 2726–2755.
• Holzmann, H., Munk, A. and Stratmann, B. (2004). Identifiability of finite mixtures—with applications to circular distributions. Sankhyā 66 440–449.
• Ishwaran, H., James, L. F. and Sun, J. (2001). Bayesian model selection in finite mixtures by marginal density decompositions. J. Amer. Statist. Assoc. 96 1316–1332.
• Kuhn, M. A., Feigelson, E. D., Getman, K. V., Baddeley, A. J., Broos, P. S., Sills, A., Bate, M. R., Povich, M. S., Luhman, K. L., Busk, H. A. et al. (2014). The spatial structure of young stellar clusters. I. Subclusters. The Astrophysical Journal 787 107.
• Le Cam, L. (1960). Locally asymptotically normal families of distributions. Certain approximations to families of distributions and their use in the theory of estimation and testing hypotheses. Univ. California Publ. Statist. 3 37–98.
• Le Cam, L. (1986). Asymptotic Methods in Statistical Decision Theory. Springer, New York.
• Lindsay, B. G. (1989). Moment matrices: Applications in mixtures. Ann. Statist. 17 722–740.
• Liu, M. and Hancock, G. R. (2014). Unrestricted mixture models for class identification in growth mixture modeling. Educational and Psychological Measurement 74 557–584.
• Martin, R. (2012). Convergence rate for predictive recursion estimation of finite mixtures. Statist. Probab. Lett. 82 378–384.
• Massart, P. (1990). The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. Ann. Probab. 18 1269–1283.
• McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley, New York.
• Nguyen, X. (2013). Convergence of latent mixing measures in finite and infinite mixture models. Ann. Statist. 41 370–400.
• Pearson, K. (1894). Contributions to the mathematical theory of evolution. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 185 71–110.
• Rousseau, J. and Mengersen, K. (2011). Asymptotic behaviour of the posterior distribution in overfitted mixture models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 73 689–710.
• Teh, Y. W. (2010). Dirichlet process. In Encyclopedia of Machine Learning (C. Sammut and G. Webb, eds.) 280–287. Springer, New York.
• Titterington, D. M., Smith, A. F. M. and Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley, Chichester.
• van de Geer, S. (1996). Rates of convergence for the maximum likelihood estimator in mixture models. J. Nonparametr. Stat. 6 293–310.
• van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge.
• Yang, Y. (2005). Can the strengths of AIC and BIC be shared? A conflict between model indentification and regression estimation. Biometrika 92 937–950.
• Zhu, H.-T. and Zhang, H. (2004). Hypothesis testing in mixture regression models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 66 3–16.
• Zhu, H. and Zhang, H. (2006). Asymptotics for estimation and testing procedures under loss of identifiability. J. Multivariate Anal. 97 19–45.

#### Supplemental materials

• Auxiliary results and technical details. This supplement gathers proof details for some of the assertions made in the paper.