A nonparametric Bayesian extension of Factor Analysis (FA) is proposed in which the observed data Y are modeled as a linear superposition of a potentially infinite number of hidden factors X, weighted by a mixing matrix G. The Indian Buffet Process (IBP) is used as a prior on G to incorporate sparsity and to allow the number of latent features to be inferred. The model’s utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.
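To make the generative structure concrete, the following is a minimal sketch of how data could be simulated from a model of this kind. It is not the authors' implementation. It assumes that sparsity enters through a binary mask Z drawn from the IBP and multiplied element-wise with Gaussian weights to give G, that the factors X are standard normal, and that isotropic Gaussian noise is added; none of these details are stated in the abstract. The function names (`sample_poisson`, `sample_ibp`, `simulate`) and all parameter values are hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw from Poisson(lam) using Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def sample_ibp(rng, num_rows, alpha):
    """Sample a binary matrix Z from the IBP via the sequential 'buffet' construction.

    Row i reuses existing column k with probability counts[k] / i, then opens
    Poisson(alpha / i) new columns. The number of columns is not fixed in advance.
    """
    rows, counts = [], []
    for i in range(1, num_rows + 1):
        row = [1 if rng.random() < m / i else 0 for m in counts]
        new = sample_poisson(rng, alpha / i)
        row.extend([1] * new)
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row)
    num_cols = len(counts)
    # Pad earlier rows with zeros for columns that were opened later.
    return [r + [0] * (num_cols - len(r)) for r in rows]


def simulate(num_genes=20, num_samples=5, alpha=2.0, noise_sd=0.1, seed=0):
    """Simulate Y = G X + noise, with G = Z * W (element-wise); an assumed form."""
    rng = random.Random(seed)
    Z = sample_ibp(rng, num_genes, alpha)
    K = len(Z[0]) if Z else 0
    # Assumed: Gaussian weights masked by Z give the sparse mixing matrix G.
    G = [[z * rng.gauss(0.0, 1.0) for z in row] for row in Z]
    # Assumed: standard normal hidden factors, one row per factor.
    X = [[rng.gauss(0.0, 1.0) for _ in range(num_samples)] for _ in range(K)]
    Y = [
        [
            sum(G[d][k] * X[k][n] for k in range(K)) + rng.gauss(0.0, noise_sd)
            for n in range(num_samples)
        ]
        for d in range(num_genes)
    ]
    return Z, G, X, Y


if __name__ == "__main__":
    Z, G, X, Y = simulate()
    print("number of active factors:", len(Z[0]))
    print("fraction of nonzero entries in G:",
          sum(map(sum, Z)) / max(1, len(Z) * len(Z[0])))
```

In this sketch the number of columns of Z, and hence the number of factors, is an outcome of the draw rather than an input, which is the sense in which the number of latent features can be inferred rather than fixed beforehand. Larger values of `alpha` tend to produce more factors.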
"Nonparametric Bayesian sparse factor models with application to gene expression modeling." Ann. Appl. Stat. 5 (2B) 1534 - 1552, June 2011. https://doi.org/10.1214/10-AOAS435