August 2022
Sharp optimal recovery in the two component Gaussian mixture model
Mohamed Ndaoud
Ann. Statist. 50(4): 2096-2126 (August 2022). DOI: 10.1214/22-AOS2178

Abstract

In this paper, we study the problem of clustering in the two component Gaussian mixture model when the centers are separated by some $\Delta > 0$. We present a nonasymptotic lower bound for the corresponding minimax Hamming risk, improving on existing results. We also propose an efficient and adaptive procedure that is minimax rate optimal. Moreover, the rate optimality is sharp asymptotically, as the sample size goes to infinity. Our procedure is based on a variant of Lloyd’s iterations initialized by a spectral method. As a consequence of the nonasymptotic results, we find a sharp phase transition for the problem of exact recovery in the Gaussian mixture model. We prove that the phase transition occurs around the critical threshold $\bar{\Delta}$ given by

$$\bar{\Delta}^2 = \sigma^2 \left(1 + \sqrt{1 + \frac{2p}{n \log n}}\right) \log n.$$
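
To make the threshold concrete, the following is a minimal sketch that evaluates the formula numerically. It assumes that n is the sample size, p the ambient dimension, and σ the noise level, which the abstract does not spell out; the function name `critical_threshold_sq` is hypothetical and does not come from the paper.

```python
import math


def critical_threshold_sq(n: int, p: int, sigma: float = 1.0) -> float:
    """Squared critical separation, read off the threshold formula.

    Assumed meaning of the symbols (not stated in the abstract):
    n = sample size, p = dimension, sigma = noise standard deviation.
    """
    if n < 2 or p < 1 or sigma <= 0:
        raise ValueError("need n >= 2, p >= 1 and sigma > 0")
    log_n = math.log(n)
    return sigma**2 * (1.0 + math.sqrt(1.0 + 2.0 * p / (n * log_n))) * log_n


if __name__ == "__main__":
    n = 1000
    # Compare a low-dimensional and two higher-dimensional settings.
    for p in (1, 1_000, 1_000_000):
        print(f"n={n}, p={p}: threshold^2 = {critical_threshold_sq(n, p):.3f}")
```

Read directly off the formula, the threshold is close to 2σ² log n when p is much smaller than n log n, and grows like σ² √(2p log n / n) when p is much larger than n log n.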

Funding Statement

This work was funded by CY Initiative of Excellence Paris Seine. The first version of this work (Ndaoud 2019, arXiv) was supported by the National Science Foundation grant CIF-1908905.

Acknowledgments

I would like to thank Christophe Giraud and Alexandre Tsybakov for stimulating discussions on clustering in Gaussian mixtures. I would also like to thank Pierre Bellec for helpful discussions that improved the presentation of this paper. I thank the anonymous reviewers for their insightful comments.

Citation

Mohamed Ndaoud. "Sharp optimal recovery in the two component Gaussian mixture model." Ann. Statist. 50(4): 2096-2126, August 2022. https://doi.org/10.1214/22-AOS2178

Information

Received: 1 July 2020; Revised: 1 October 2021; Published: August 2022
First available in Project Euclid: 25 August 2022

MathSciNet: MR4474484
zbMATH: 07610764
Digital Object Identifier: 10.1214/22-AOS2178

Subjects:
Primary: 62H30
Secondary: 62C20, 62F07

Keywords: Gaussian mixtures, Lloyd’s algorithm, sharp recovery

Rights: Copyright © 2022 Institute of Mathematical Statistics

JOURNAL ARTICLE
31 PAGES
