Open Access
2022 Statistical inference for normal mixtures with unknown number of components
Mian Huang, Shiyi Tang, Weixin Yao
Electron. J. Statist. 16(2): 5149-5181 (2022). DOI: 10.1214/22-EJS2061


Statistical inference for normal mixture models with an unknown number of components has long been challenging because of nonidentifiability, a degenerate Fisher information matrix, and boundary parameters. In this paper, a penalized likelihood estimation procedure is proposed for mixtures of normals with an unknown number of components that achieves both order selection consistency and the root-n convergence rate for the component parameter estimators. We show that the proposed estimator avoids being trapped in certain degenerate regions of the nonidentifiable subset of the parameter space for over-fitted normal mixture models, so that a regular asymptotic quadratic Taylor expansion of the mixture log-likelihood can be derived. With a suitable penalty function on the mixing proportions, the new estimator is proved to be consistent in order selection and to have an asymptotically normal distribution. Our derived sparsity conditions also reveal some surprising differences among commonly used penalty functions and explain why some popular penalty functions, such as Lasso and SCAD, give unsatisfactory results in order selection. Extensive simulations and a real data analysis are conducted to demonstrate the effectiveness of the proposed estimator.
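
As a rough illustration of the idea of penalizing mixing proportions, the following minimal sketch evaluates a penalized log-likelihood for a univariate normal mixture. It is not the authors' estimator: the abstract does not state the penalty, so the log-type penalty, the tuning parameters `lam` and `eps`, the scaling by the sample size, and all function names are assumptions made for illustration only. The sketch shows how an over-fitted model with a duplicated component has the same likelihood as a sparser one (the nonidentifiability mentioned above), while the penalty favors the version with a zero mixing proportion.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weights, means, sds):
    """Log-likelihood of a univariate normal mixture."""
    total = 0.0
    for x in data:
        density = sum(
            w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total


def log_penalty(weights, lam, eps):
    """Hypothetical penalty on mixing proportions (an assumption, not taken
    from the paper): lam * sum_k log((eps + pi_k) / eps).
    A component with pi_k = 0 contributes exactly zero."""
    return lam * sum(math.log((eps + w) / eps) for w in weights)


def penalized_loglik(data, weights, means, sds, lam=1.0, eps=1e-3):
    """Log-likelihood minus n times the penalty (scaling by n is assumed)."""
    n = len(data)
    return mixture_loglik(data, weights, means, sds) - n * log_penalty(
        weights, lam, eps
    )


# Toy data and an over-fitted two-component model with identical components.
data = [-1.0, 0.0, 1.0]
means, sds = [0.0, 0.0], [1.0, 1.0]

split = penalized_loglik(data, [0.5, 0.5], means, sds)   # duplicated component
sparse = penalized_loglik(data, [1.0, 0.0], means, sds)  # one proportion at zero
```

Here `split` and `sparse` share the same unpenalized log-likelihood, but `sparse` attains the larger penalized value, which is the mechanism by which a penalty on mixing proportions can push redundant components toward zero weight. In practice the maximization would be carried out iteratively, for example with an EM-type algorithm as the keywords suggest, which this sketch does not implement.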

Funding Statement

Yao’s research is supported in part by NSF grant DMS-2210272.


Acknowledgments

The authors are grateful to Jiahua Chen for helpful comments and discussions. The authors also thank the editors and the referees for their insightful comments and suggestions, which greatly improved this article.



Mian Huang, Shiyi Tang, Weixin Yao. "Statistical inference for normal mixtures with unknown number of components." Electron. J. Statist. 16 (2) 5149-5181, 2022.


Received: 1 May 2021; Published: 2022
First available in Project Euclid: 6 October 2022

MathSciNet: MR4492987
zbMATH: 07603105
Digital Object Identifier: 10.1214/22-EJS2061

Keywords: EM algorithm, normal mixture model, order selection, penalized estimation
