Nonparametric estimation of component distributions in a multivariate mixture

Peter Hall; Xiao-Hua Zhou

doi:10.1214/aos/1046294462

Ann. Statist. 31(1): 201-224 (February 2003). DOI: 10.1214/aos/1046294462

Abstract

Suppose $k$-variate data are drawn from a mixture of two distributions, each having independent components. It is desired to estimate the univariate marginal distributions in each of the products, as well as the mixing proportion. This is the setting of two-class, fully parametrized latent models that has been proposed for estimating the distributions of medical test results when disease status is unavailable. The problem is one of inference in a mixture of distributions without training data, and until now it has been tackled only in a fully parametric setting. We investigate the possibility of using nonparametric methods. Of course, when $k=1$ the problem is not identifiable from a nonparametric viewpoint. We show that the problem is "almost" identifiable when $k=2$; there, the set of all possible representations can be expressed, in terms of any one of those representations, as a two-parameter family. Furthermore, it is proved that when $k\geq3$ the problem is nonparametrically identifiable under particularly mild regularity conditions. In this case we introduce root-$n$ consistent nonparametric estimators of the $2k$ univariate marginal distributions and the mixing proportion. Finite-sample and asymptotic properties of the estimators are described.
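
For concreteness, the model described in the abstract can be sketched as follows. The symbols $\pi$, $F_{1j}$ and $F_{2j}$ are notation assumed here for illustration only; they are not taken from the abstract and may differ from the notation used in the paper.

$$
F(x_1,\dots,x_k) \;=\; \pi \prod_{j=1}^{k} F_{1j}(x_j) \;+\; (1-\pi) \prod_{j=1}^{k} F_{2j}(x_j), \qquad 0 < \pi < 1 .
$$

Under this reading, the unknowns are the $2k$ univariate marginals $F_{1j}, F_{2j}$ ($j = 1,\dots,k$) and the mixing proportion $\pi$. When $k=1$ the display reduces to $F = \pi F_{11} + (1-\pi) F_{21}$, and a single distribution $F$ can be split this way in many different ways, which is why no nonparametric identification is possible in that case.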

Citation

Peter Hall, Xiao-Hua Zhou. "Nonparametric estimation of component distributions in a multivariate mixture." Ann. Statist. 31 (1) 201 - 224, February 2003. https://doi.org/10.1214/aos/1046294462

Information

Published: February 2003

First available in Project Euclid: 26 February 2003

zbMATH: 1018.62021

MathSciNet: MR1962504

Digital Object Identifier: 10.1214/aos/1046294462

Subjects:

Primary: 62G05

Secondary: 62G70

Keywords: biased bootstrap, distribution estimation, empirical likelihood, identification, latent model, multivariate analysis, nonparametric maximum likelihood, root-$n$ consistency
