## Bernoulli

### Robust estimation of mixing measures in finite mixture models

#### Abstract

In finite mixture models, apart from the underlying mixing measure, the true kernel density function of each subpopulation in the data is, in many scenarios, unknown. Perhaps the most popular approach is to choose kernel functions that we empirically believe generated our data and to fit the model with these kernels. Nevertheless, whenever the chosen kernel differs from the true kernel, statistical inference of the mixing measure will be highly unstable. To overcome this challenge, we propose flexible and efficient robust estimators of the mixing measure in these models, inspired by the minimum Hellinger distance estimator, model selection criteria, and the superefficiency phenomenon. We demonstrate that our estimators consistently recover the true number of components and achieve the optimal convergence rates of parameter estimation under both the well-specified and misspecified kernel settings for any fixed bandwidth. These desirable asymptotic properties are illustrated via careful simulation studies with both synthetic and real data.
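For readers unfamiliar with the minimum Hellinger distance idea mentioned in the abstract, the following is a minimal illustrative sketch and not the estimator proposed in the paper. It assumes a two-component Gaussian location mixture with known unit variance and equal weights, smooths the sample with a Gaussian kernel density estimate at a fixed bandwidth, and picks the pair of means that minimizes the squared Hellinger distance to that estimate by grid search. All function names, the bandwidth value, and the candidate grid are hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sd):
    """Density of a Gaussian location mixture with a common standard deviation."""
    return sum(w * normal_pdf(x, m, sd) for w, m in zip(weights, means))


def kde(x, data, bandwidth):
    """Gaussian kernel density estimate of the sample at x."""
    return sum(normal_pdf(x, xi, bandwidth) for xi in data) / len(data)


def squared_hellinger(f_vals, g_vals, step):
    """h^2(f, g) = 1/2 * integral of (sqrt(f) - sqrt(g))^2, as a Riemann sum."""
    total = sum((math.sqrt(f) - math.sqrt(g)) ** 2 for f, g in zip(f_vals, g_vals))
    return 0.5 * step * total


def fit_location_mixture(data, bandwidth, candidate_means, grid, step):
    """Grid search for the two means minimizing the distance to the KDE.

    Assumption for illustration: weights fixed at (0.5, 0.5) and unit variance.
    """
    kde_vals = [kde(x, data, bandwidth) for x in grid]
    best = None
    for m1 in candidate_means:
        for m2 in candidate_means:
            if m1 >= m2:
                continue  # enforce an ordering so each pair is visited once
            model_vals = [mixture_pdf(x, (0.5, 0.5), (m1, m2), 1.0) for x in grid]
            dist = squared_hellinger(kde_vals, model_vals, step)
            if best is None or dist < best[0]:
                best = (dist, m1, m2)
    return best


# Synthetic sample: 200 draws from N(-2, 1) and 200 draws from N(2, 1).
random.seed(0)
sample = [random.gauss(-2.0, 1.0) for _ in range(200)]
sample += [random.gauss(2.0, 1.0) for _ in range(200)]

step = 0.1
grid = [-8.0 + step * i for i in range(161)]            # integration grid on [-8, 8]
candidate_means = [-4.0 + 0.5 * i for i in range(17)]   # candidate means on [-4, 4]

best_distance, mean_1, mean_2 = fit_location_mixture(
    sample, bandwidth=0.3, candidate_means=candidate_means, grid=grid, step=step
)
print(best_distance, mean_1, mean_2)
```

The grid search keeps the sketch free of third-party dependencies; in practice a numerical optimizer over weights, means, and scales would replace it. Smoothing the data inflates its apparent variance slightly, which is one reason the choice of bandwidth matters in estimators of this type.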

#### Article information

Source
Bernoulli, Volume 26, Number 2 (2020), 828-857.

Dates
Revised: October 2018
First available in Project Euclid: 31 January 2020

https://projecteuclid.org/euclid.bj/1580461565

Digital Object Identifier
doi:10.3150/18-BEJ1087

Mathematical Reviews number (MathSciNet)
MR4058353

Zentralblatt MATH identifier
07166549

#### Citation

Ho, Nhat; Nguyen, XuanLong; Ritov, Ya’acov. Robust estimation of mixing measures in finite mixture models. Bernoulli 26 (2020), no. 2, 828–857. doi:10.3150/18-BEJ1087. https://projecteuclid.org/euclid.bj/1580461565

#### Supplemental materials

• Supplement to “Robust estimation of mixing measures in finite mixture models”. In this supplemental material, we provide self-contained proofs of several key results in the paper.