Electronic Journal of Statistics

Distributional equivalence and structure learning for bow-free acyclic path diagrams

Christopher Nowzohour, Marloes H. Maathuis, Robin J. Evans, and Peter Bühlmann

Full-text: Open access


We consider the problem of structure learning for bow-free acyclic path diagrams (BAPs). BAPs can be viewed as a generalization of linear Gaussian DAG models that allow for certain hidden variables. We present a first method for this problem using a greedy score-based search algorithm. We also prove some necessary and some sufficient conditions for distributional equivalence of BAPs, which are used in an algorithmic approach to compute (nearly) equivalent model structures. This allows us to infer lower bounds of causal effects. We also present applications to real and simulated datasets using our publicly available R package.
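As a small illustration of the object the abstract is about, the sketch below checks whether a mixed graph is a BAP: the directed edges must form an acyclic graph, and no pair of nodes may be joined by both a directed and a bidirected edge (a "bow"). The function name `is_bap` and the edge-list representation are assumptions made for this example only; they are not taken from the paper or from the authors' R package.

```python
from collections import defaultdict, deque


def is_bap(nodes, directed, bidirected):
    """Return True if the mixed graph is a bow-free acyclic path diagram.

    Hypothetical helper for illustration:
    nodes      -- iterable of node labels
    directed   -- list of (parent, child) pairs, read as parent -> child
    bidirected -- list of (a, b) pairs, read as a <-> b
    """
    nodes = list(nodes)

    # Bow-free: no pair of nodes shares a directed and a bidirected edge.
    directed_pairs = {frozenset(edge) for edge in directed}
    bidirected_pairs = {frozenset(edge) for edge in bidirected}
    if directed_pairs & bidirected_pairs:
        return False

    # Acyclic: Kahn's algorithm on the directed part of the graph.
    indegree = {v: 0 for v in nodes}
    children = defaultdict(list)
    for parent, child in directed:
        children[parent].append(child)
        indegree[child] += 1

    queue = deque(v for v in nodes if indegree[v] == 0)
    visited = 0
    while queue:
        v = queue.popleft()
        visited += 1
        for w in children[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)

    # Every node is removed exactly when the directed part has no cycle.
    return visited == len(nodes)


# X -> Y -> Z with X <-> Z: acyclic and bow-free.
print(is_bap(["X", "Y", "Z"], [("X", "Y"), ("Y", "Z")], [("X", "Z")]))
# X -> Y together with X <-> Y forms a bow.
print(is_bap(["X", "Y", "Z"], [("X", "Y"), ("Y", "Z")], [("X", "Y")]))
```

The first call describes a valid BAP, while the second contains a bow between X and Y and is rejected.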

Article information

Electron. J. Statist. Volume 11, Number 2 (2017), 5342-5374.

Received: October 2016
First available in Project Euclid: 28 December 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1514430421

Digital Object Identifier
doi:10.1214/17-EJS1372

Keywords
Causal inference, structure learning, hidden variables, latent variables, path diagrams, structural equation models, distributional equivalence, greedy search

Creative Commons Attribution 4.0 International License.


Nowzohour, Christopher; Maathuis, Marloes H.; Evans, Robin J.; Bühlmann, Peter. Distributional equivalence and structure learning for bow-free acyclic path diagrams. Electron. J. Statist. 11 (2017), no. 2, 5342--5374. doi:10.1214/17-EJS1372. https://projecteuclid.org/euclid.ejs/1514430421


