## The Annals of Statistics

- Ann. Statist.
- Volume 45, Number 5 (2017), 2016-2045.

### Bayesian Poisson calculus for latent feature modeling via generalized Indian Buffet Process priors

#### Abstract

Statistical latent feature models, such as latent factor models, are models where each observation is associated with a vector of latent features. A general problem is how to select the number/types of features, and related quantities. In Bayesian statistical machine learning, one seeks (nonparametric) models where one can learn such quantities in the presence of observed data. The Indian Buffet Process (IBP), devised by Griffiths and Ghahramani (2005), generates a (sparse) latent binary matrix with columns representing a potentially unbounded number of features and where each row corresponds to an individual or object. Its generative scheme is cast in terms of customers entering sequentially an Indian Buffet restaurant and selecting previously sampled dishes as well as new dishes. Dishes correspond to latent features shared by individuals. The IBP has been applied to a wide range of statistical problems. Recent works have demonstrated the utility of generalizations to nonbinary matrices. The purpose of this work is to describe a unified mechanism for construction, Bayesian analysis, and practical sampling of broad generalizations of the IBP that generate (sparse) matrices with general entries. An adaptation of the Poisson partition calculus is employed to handle the complexities, including combinatorial aspects, of these models. Our work reveals a spike and slab characterization, and also presents a general framework for multivariate extensions. We close by highlighting a multivariate IBP with condiments, and the role of a stable-Beta Dirichlet multivariate prior.
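The restaurant metaphor in the abstract can be made concrete with a minimal sketch of the basic binary IBP. It assumes the standard one-parameter form from the wider literature, which is not spelled out in this abstract: customer `i` takes each previously sampled dish `k` with probability `m_k / i`, where `m_k` is the number of earlier customers who took it, and then samples a Poisson(`alpha / i`) number of new dishes. The names `sample_ibp`, `sample_poisson`, and `alpha` are illustrative only. The sketch does not implement the generalized priors or the Poisson partition calculus developed in the paper.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's multiplication method (suited to small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_customers, alpha, seed=0):
    """Return a binary matrix (list of rows): rows are customers, columns are dishes."""
    rng = random.Random(seed)
    dish_counts = []  # dish_counts[k] = number of customers so far who took dish k
    rows = []         # rows[i] = set of dish indices chosen by customer i + 1

    for i in range(1, num_customers + 1):
        chosen = set()

        # Previously sampled dishes: take dish k with probability m_k / i.
        for k, m_k in enumerate(dish_counts):
            if rng.random() < m_k / i:
                chosen.add(k)

        # New dishes: a Poisson(alpha / i) number, each taken by this customer.
        for _ in range(sample_poisson(alpha / i, rng)):
            dish_counts.append(0)
            chosen.add(len(dish_counts) - 1)

        for k in chosen:
            dish_counts[k] += 1
        rows.append(chosen)

    num_dishes = len(dish_counts)
    return [[1 if k in row else 0 for k in range(num_dishes)] for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_customers=6, alpha=2.0, seed=1):
        print(row)
```

Each row of the output is one customer's binary feature vector. The number of columns is random and grows slowly with the number of customers, which is the "potentially unbounded number of features" the abstract refers to. The generalizations discussed in the paper replace the binary entries with more general ones.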

#### Article information

**Source**

Ann. Statist., Volume 45, Number 5 (2017), 2016-2045.

**Dates**

Received: April 2015

Revised: November 2015

First available in Project Euclid: 31 October 2017

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1509436826

**Digital Object Identifier**

doi:10.1214/16-AOS1517

**Mathematical Reviews number (MathSciNet)**

MR3718160

**Zentralblatt MATH identifier**

06821117

**Subjects**

Primary: 60C05: Combinatorial probability; 60G09: Exchangeability

Secondary: 60G57: Random measures; 60E99: None of the above, but in this section

**Keywords**

Bayesian statistical machine learning; Indian buffet process; nonparametric latent feature models; Poisson process calculus; spike and slab priors

#### Citation

James, Lancelot F. Bayesian Poisson calculus for latent feature modeling via generalized Indian Buffet Process priors. Ann. Statist. 45 (2017), no. 5, 2016-2045. doi:10.1214/16-AOS1517. https://projecteuclid.org/euclid.aos/1509436826