The Annals of Statistics
- Ann. Statist.
- Volume 46, Number 6A (2018), 2623-2656.
Margins of discrete Bayesian networks
Bayesian network models with latent variables are widely used in statistics and machine learning. In this paper, we provide a complete algebraic characterization of these models when the observed variables are discrete and no assumption is made about the state-space of the latent variables. We show that the latent variable model is algebraically equivalent to the so-called nested Markov model, meaning that the two are the same up to inequality constraints on the joint probabilities. In particular, these two models have the same dimension, differing only by inequality constraints for which there is no general description. The nested Markov model is therefore the closest possible description of the latent variable model that avoids consideration of inequalities. A consequence of this is that the constraint-finding algorithm of Tian and Pearl [In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (2002) 519–527] is complete for finding equality constraints.
Latent variable models suffer from difficulties of unidentifiable parameters and nonregular asymptotics; in contrast, the nested Markov model is fully identifiable, represents a curved exponential family of known dimension, and can easily be fitted using an explicit parameterization.
Received: January 2017
Revised: August 2017
First available in Project Euclid: 7 September 2018
Evans, Robin J. Margins of discrete Bayesian networks. Ann. Statist. 46 (2018), no. 6A, 2623–2656. doi:10.1214/17-AOS1631. https://projecteuclid.org/euclid.aos/1536307228
- Supplement to “Margins of discrete Bayesian networks”. Technical proofs and some additional examples are contained in the supplement.