Bernoulli, Volume 25, Number 2 (2019), 848-876.

Smooth, identifiable supermodels of discrete DAG models with latent variables

Robin J. Evans and Thomas S. Richardson

Abstract

We provide a parameterization of the discrete nested Markov model, which is a supermodel that approximates DAG models (Bayesian network models) with latent variables. Such models are widely used in causal inference and machine learning. We explicitly evaluate their dimension, show that they are curved exponential families of distributions, and fit them to data. The parameterization avoids the irregularities and unidentifiability of latent variable models. The parameters used are all fully identifiable and causally interpretable quantities.

Article information

Source
Bernoulli, Volume 25, Number 2 (2019), 848-876.

Dates
Received: December 2015
Revised: January 2017
First available in Project Euclid: 6 March 2019

Permanent link to this document
https://projecteuclid.org/euclid.bj/1551862837

Digital Object Identifier
doi:10.3150/17-BEJ1005

Zentralblatt MATH identifier
07049393

Keywords
Bayesian network; DAG; nested Markov model; parameterization

Citation

Evans, Robin J.; Richardson, Thomas S. Smooth, identifiable supermodels of discrete DAG models with latent variables. Bernoulli 25 (2019), no. 2, 848-876. doi:10.3150/17-BEJ1005. https://projecteuclid.org/euclid.bj/1551862837

