Electronic Journal of Statistics

Distributional equivalence and structure learning for bow-free acyclic path diagrams

Christopher Nowzohour, Marloes H. Maathuis, Robin J. Evans, and Peter Bühlmann

Full-text: Open access


We consider the problem of structure learning for bow-free acyclic path diagrams (BAPs). BAPs can be viewed as a generalization of linear Gaussian DAG models that allows for certain hidden variables. We present a first method for this problem using a greedy score-based search algorithm. We also prove some necessary and some sufficient conditions for distributional equivalence of BAPs, which are used in an algorithmic approach to compute (nearly) equivalent model structures. This allows us to infer lower bounds of causal effects. We also present applications to real and simulated datasets using our publicly available R package.
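As an illustration of the model class named in the abstract, the following is a minimal Python sketch of checking whether a mixed graph is a bow-free acyclic path diagram: the directed part must be acyclic, and no pair of nodes may be joined by both a directed and a bidirected edge. It is not taken from the paper or its R package; the function names and the edge-list representation are assumptions made for this example.

```python
from collections import defaultdict


def is_acyclic(nodes, directed):
    """Return True if the directed edges (parent, child) contain no cycle.

    Uses Kahn's algorithm: repeatedly remove nodes with no incoming edges.
    """
    indegree = {v: 0 for v in nodes}
    children = defaultdict(list)
    for parent, child in directed:
        children[parent].append(child)
        indegree[child] += 1
    queue = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while queue:
        v = queue.pop()
        visited += 1
        for c in children[v]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    # Every node is removed exactly when there is no directed cycle.
    return visited == len(nodes)


def is_bow_free(directed, bidirected):
    """Return True if no node pair has both a directed and a bidirected edge."""
    directed_pairs = {frozenset(e) for e in directed}
    bidirected_pairs = {frozenset(e) for e in bidirected}
    return directed_pairs.isdisjoint(bidirected_pairs)


def is_bap(nodes, directed, bidirected):
    """Hypothetical helper: acyclic directed part and no bows."""
    return is_acyclic(nodes, directed) and is_bow_free(directed, bidirected)


nodes = ["a", "b", "c"]
chain_with_confounding = is_bap(nodes, [("a", "b"), ("b", "c")], [("a", "c")])
bow = is_bap(nodes, [("a", "b")], [("b", "a")])
cycle = is_bap(nodes, [("a", "b"), ("b", "a")], [])
```

Here the first graph is a valid BAP, the second contains a bow between a and b, and the third contains a directed cycle. A greedy search of the kind the abstract mentions would only move between graphs that pass such a check.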

Article information

Electron. J. Statist. Volume 11, Number 2 (2017), 5342-5374.

Received: October 2016
First available in Project Euclid: 28 December 2017

Keywords: Causal inference, structure learning, hidden variables, latent variables, path diagrams, structural equation models, distributional equivalence, greedy search

Creative Commons Attribution 4.0 International License.


Nowzohour, Christopher; Maathuis, Marloes H.; Evans, Robin J.; Bühlmann, Peter. Distributional equivalence and structure learning for bow-free acyclic path diagrams. Electron. J. Statist. 11 (2017), no. 2, 5342--5374. doi:10.1214/17-EJS1372. https://projecteuclid.org/euclid.ejs/1514430421

