Source: Ann. Statist. Volume 28, Number 4 (2000), 1105-1127.
Gaussian mixtures provide a convenient method of density
estimation that lies somewhere between parametric models and kernel density
estimators. When the number of components of the mixture is allowed to increase
as sample size increases, the model is called a mixture sieve. We establish a
bound on the rate of convergence in Hellinger distance for density estimation
using the Gaussian mixture sieve assuming that the true density is itself a
mixture of Gaussians; the underlying mixing measure of the true density is not
necessarily assumed to have finite support. Computing the rate involves some
delicate calculations since the size of the sieve, as measured by
bracketing entropy, and the saturation rate cannot be found using
standard methods. When the mixing measure has compact support, using $k_n \sim
n^{2/3}/(\log n)^{1/3}$ components in the mixture yields a rate of order $(\log
n)^{(1+\eta)/6}/n^{1/6}$ for every $\eta > 0$. The rates depend heavily on
the tail behavior of the true density. The sensitivity to the tail behavior is
diminished by using a robust sieve which includes a long-tailed component in
the mixture. In the compact case, we obtain an improved rate of $(\log
n/n)^{1/4}$. In the noncompact case, a spectrum of interesting rates arises
depending on the thickness of the tails of the mixing measure.
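
To give a feel for the orders of magnitude in the compact-support case, the following minimal sketch evaluates the number of components $k_n$ and the two rates quoted above for a given sample size. It is only an illustration: the results hold up to unspecified constants, which are set to 1 here, and the function names and the default $\eta = 0.1$ are assumptions made for this example, not part of the paper.

```python
import math


def sieve_components(n):
    """Order of k_n ~ n^(2/3) / (log n)^(1/3), rounded up (constant assumed to be 1)."""
    if n < 2:
        raise ValueError("n must be at least 2 so that log n > 0")
    return max(1, math.ceil(n ** (2 / 3) / math.log(n) ** (1 / 3)))


def standard_rate(n, eta=0.1):
    """Order of the Hellinger rate (log n)^((1+eta)/6) / n^(1/6); eta > 0 is arbitrary."""
    if n < 2 or eta <= 0:
        raise ValueError("need n >= 2 and eta > 0")
    return math.log(n) ** ((1 + eta) / 6) / n ** (1 / 6)


def robust_rate(n):
    """Order of the improved rate (log n / n)^(1/4) for the robust sieve."""
    if n < 2:
        raise ValueError("n must be at least 2 so that log n > 0")
    return (math.log(n) / n) ** 0.25


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**6):
        print(n, sieve_components(n), standard_rate(n), robust_rate(n))
```

With constants ignored, the robust-sieve rate is smaller than the standard rate at these sample sizes, which is the sense in which it is "improved".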