August 2021
Estimation of the number of components of nonparametric multivariate finite mixture models
Caleb Kwon, Eric Mbakop
Ann. Statist. 49(4): 2178-2205 (August 2021). DOI: 10.1214/20-AOS2032

Abstract

We propose a novel estimator for the number of mixture components (denoted by M) in a nonparametric finite mixture model. The setting that we consider is one where the analyst has repeated observations of K ≥ 2 variables that are conditionally independent given a finitely supported latent variable with M support points. Under a mild assumption on the joint distribution of the observed and latent variables, we show that an integral operator T that is identified from the data has rank equal to M. We use this observation, in conjunction with the fact that singular values of operators are stable under perturbations, to propose an estimator of M, which essentially consists of a thresholding rule that counts the number of singular values of a consistent estimator of T that are greater than a data-driven threshold. We prove that our estimator of M is consistent, and establish nonasymptotic results, which provide finite sample performance guarantees for our estimator. We present a Monte Carlo study, which shows that our estimator performs well for samples of moderate size.
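
The thresholding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: the abstract does not specify the operator T, its estimator, or the data-driven threshold. The sketch assumes instead a simplified discrete setting with two observed variables, where the joint probability matrix stands in for T, and the threshold is passed in by the caller as a placeholder. All names (`joint_matrix`, `count_singular_values_above`, `simulate`, the example weights and component distributions) are hypothetical.

```python
import numpy as np

# Hypothetical example: M = 3 components, two discrete variables with 5 levels.
WEIGHTS = np.array([0.3, 0.3, 0.4])
P1 = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.05, 0.10, 0.70, 0.10, 0.05],
    [0.05, 0.05, 0.10, 0.10, 0.70],
])
P2 = P1.copy()  # assumption: same component distributions for both variables


def joint_matrix(weights, p1, p2):
    """Population joint matrix: P[a, b] = sum_m w_m * p1[m, a] * p2[m, b]."""
    return np.einsum("m,ma,mb->ab", weights, p1, p2)


def count_singular_values_above(t_hat, threshold):
    """Thresholding rule: number of singular values strictly above threshold."""
    s = np.linalg.svd(t_hat, compute_uv=False)
    return int(np.sum(s > threshold))


def sample_discrete(rng, prob_rows):
    """Draw one category per row of prob_rows by inverting the row-wise CDF."""
    u = rng.random(len(prob_rows))
    idx = (u[:, None] > np.cumsum(prob_rows, axis=1)).sum(axis=1)
    return np.minimum(idx, prob_rows.shape[1] - 1)  # guard against rounding


def simulate(rng, n):
    """Draw n pairs (x1, x2) that are conditionally independent given z."""
    z = rng.choice(len(WEIGHTS), size=n, p=WEIGHTS)
    return sample_discrete(rng, P1[z]), sample_discrete(rng, P2[z])


def empirical_joint(x1, x2, n_levels):
    """Empirical joint frequency matrix of the two observed variables."""
    counts = np.zeros((n_levels, n_levels))
    np.add.at(counts, (x1, x2), 1.0)
    return counts / len(x1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5000
    x1, x2 = simulate(rng, n)
    p_hat = empirical_joint(x1, x2, n_levels=5)
    # Placeholder threshold, NOT the paper's data-driven choice.
    tau = np.sqrt(np.log(n) / n)
    print("singular values:", np.linalg.svd(p_hat, compute_uv=False))
    print("estimated M:", count_singular_values_above(p_hat, tau))
```

In this toy setting the population matrix has rank exactly 3 because the three component distributions are linearly independent, so the count of singular values above a small threshold recovers M. With sampled data the result depends on how the threshold compares with the sampling noise, which is the role the paper's data-driven threshold plays.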

Acknowledgments

We thank Ivan Canay, Denis Chetverikov and Joel Horowitz for their helpful comments and suggestions. We would also like to thank the Editor, an Associate Editor and two anonymous referees for a careful reading of the manuscript and for comments that greatly improved this paper. We are also grateful to Hiro Kasahara and Katsumi Shimotsu for providing us with their code and data sets for the empirical section of this paper.

Citation

Caleb Kwon, Eric Mbakop. "Estimation of the number of components of nonparametric multivariate finite mixture models." Ann. Statist. 49 (4) 2178-2205, August 2021. https://doi.org/10.1214/20-AOS2032

Information

Received: 1 September 2019; Revised: 1 June 2020; Published: August 2021
First available in Project Euclid: 29 September 2021

MathSciNet: MR4319246
zbMATH: 1486.62085
Digital Object Identifier: 10.1214/20-AOS2032

Subjects:
Primary: 62G05
Secondary: 47A55 , 47G10 , 47N30 , 62G15 , 62H30

Keywords: Conditional independence , Finite mixture model , latent model , multivariate data , nonparametric mixture

Rights: Copyright © 2021 Institute of Mathematical Statistics

JOURNAL ARTICLE
28 PAGES

