Open Access
June 2024
Contaminated Gibbs-Type Priors
Federico Camerlenghi, Riccardo Corradin, Andrea Ongaro
Bayesian Anal. 19(2): 347-376 (June 2024). DOI: 10.1214/22-BA1358

Abstract

Gibbs-type priors are combinatorial processes widely used as key components in several Bayesian nonparametric models. By virtue of their flexibility and mathematical tractability, they are the predominant priors in species sampling problems and mixture modeling. We introduce a new family of processes that extends the Gibbs-type family by including a contaminant component in the model to account for an excess of observations with frequency one. We first investigate the induced random partition, the associated predictive distribution, and the asymptotic behavior of both the total number of blocks and the number of blocks with a given frequency: all the results we obtain are in closed form and easily interpretable. A remarkable aspect of contaminated Gibbs-type priors is their predictive structure: unlike that of the standard Gibbs-type family, it also depends on the number of observations with frequency one in the observed sample. As a noteworthy example, we focus on the contaminated version of the Pitman-Yor process, which turns out to be analytically tractable and computationally feasible. Finally, we illustrate the advantages of our construction in different applications: we show how it helps to improve predictive inference in a species-related dataset exhibiting a high number of species with frequency one, and we discuss the use of the proposed construction in mixture models to perform density estimation and outlier detection.
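
As background for the baseline that the abstract extends, the following is a minimal sketch of the standard (uncontaminated) Pitman-Yor sequential sampling scheme, in which a new block is opened with probability (theta + sigma * k) / (theta + n) and an existing block of size n_j is joined with probability (n_j - sigma) / (theta + n). It is not the authors' contaminated construction, whose formulas are not stated on this page. The function name `pitman_yor_partition` and the parameter names `sigma`, `theta` and `seed` are illustrative assumptions.

```python
import random


def pitman_yor_partition(n, sigma, theta, seed=0):
    """Sample the block sizes of a random partition of n observations.

    Uses the standard Pitman-Yor predictive rule (illustrative baseline only,
    NOT the contaminated model from the paper):
      - new block:        (theta + sigma * k) / (theta + i)
      - existing block j: (n_j - sigma) / (theta + i)
    where i is the number of observations seen so far and k the number of blocks.
    """
    if not (0 <= sigma < 1 and theta > -sigma):
        raise ValueError("need 0 <= sigma < 1 and theta > -sigma")
    rng = random.Random(seed)
    sizes = []
    for i in range(n):
        k = len(sizes)
        # The first observation always opens a new block.
        p_new = 1.0 if i == 0 else (theta + sigma * k) / (theta + i)
        if rng.random() < p_new:
            sizes.append(1)
        else:
            # Join an existing block with weight proportional to (n_j - sigma).
            weights = [s - sigma for s in sizes]
            j = rng.choices(range(k), weights=weights)[0]
            sizes[j] += 1
    return sizes


if __name__ == "__main__":
    sizes = pitman_yor_partition(n=500, sigma=0.5, theta=1.0, seed=1)
    singletons = sum(1 for s in sizes if s == 1)
    print("blocks:", len(sizes), "blocks with frequency one:", singletons)
```

Counting the blocks of size one in the simulated partition gives the quantity (observations with frequency one) that, according to the abstract, the contaminated family is designed to model in excess of what a standard Gibbs-type prior produces.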

Funding Statement

The authors gratefully acknowledge the financial support from the Italian Ministry of Education, University and Research (MIUR), “Dipartimenti di Eccellenza” grant 2018-2022, and the DEMS Data Science Lab for supporting this work through computational resources.

Acknowledgments

The authors are grateful to the Associate Editor and two anonymous Referees for their valuable comments and suggestions, which led to a substantial improvement of the paper. Federico Camerlenghi is a member of the Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni (GNAMPA) of the Istituto Nazionale di Alta Matematica (INdAM).

Citation


Federico Camerlenghi, Riccardo Corradin, Andrea Ongaro. "Contaminated Gibbs-Type Priors." Bayesian Anal. 19(2): 347-376, June 2024. https://doi.org/10.1214/22-BA1358

Information

Published: June 2024
First available in Project Euclid: 9 April 2024

Digital Object Identifier: 10.1214/22-BA1358

Subjects:
Primary: 60G25, 62F15, 62G05

Keywords: Bayesian nonparametrics, Gibbs-type priors, Mixture models, Random partitions, Species sampling models

Rights: © 2024 International Society for Bayesian Analysis
