Bernoulli, Volume 23, Number 2 (2017), 1202-1232.

Marginal likelihood and model selection for Gaussian latent tree and forest models

Mathias Drton, Shaowei Lin, Luca Weihs, and Piotr Zwiernik



Gaussian latent tree models, and more generally Gaussian latent forest models, have Fisher information matrices that become singular along interesting submodels, namely those corresponding to subforests. For these singularities, we compute the real log-canonical thresholds (also known as stochastic complexities or learning coefficients) that quantify the large-sample behavior of the marginal likelihood in Bayesian inference. This provides the information needed for a recently introduced generalization of the Bayesian information criterion. Our mathematical developments treat the general setting of Laplace integrals whose phase functions are sums of squared differences between monomials and constants. We clarify how real log-canonical thresholds can be computed in this setting using polyhedral geometry, and we show how to apply the general theory to the Laplace integrals associated with Gaussian latent tree and forest models. In simulations and a data example, we demonstrate how this mathematical knowledge can be applied in model selection.
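The abstract's central objects, Laplace integrals with monomial phase functions, can be illustrated on a toy case not taken from the paper. In singular learning theory, Z_n = ∫ exp(-n f(w)) dw behaves like C n^{-λ} (log n)^{m-1}, where λ is the real log-canonical threshold and m its multiplicity. For the hypothetical phase f(x, y) = (xy)^2 on [0, 1]^2, the threshold is λ = 1/2 with m = 2 (a regular two-parameter model would give λ = dim/2 = 1), which a direct numerical sketch can confirm:

```python
import math

def Z(n, N=100_000):
    """Midpoint-rule estimate of the Laplace integral
    Z_n = int_0^1 int_0^1 exp(-n * (x*y)**2) dx dy.
    The inner y-integral is evaluated exactly:
    int_0^1 exp(-n x^2 y^2) dy = sqrt(pi)/(2 x sqrt(n)) * erf(sqrt(n) x)."""
    a = math.sqrt(n)
    h = 1.0 / N
    total = 0.0
    for i in range(N):
        x = (i + 0.5) * h  # midpoint, so x > 0 and no division by zero
        total += math.sqrt(math.pi) * math.erf(a * x) / (2.0 * a * x)
    return total * h

# For the singular phase f(x, y) = (x*y)^2, Z_n ~ C * n^(-1/2) * log(n),
# so the local slope of -log Z_n against log n approaches 1/2 from below
# (the log factor drags it down), well short of the regular rate 1.
slopes = []
for n in (1e2, 1e4, 1e6):
    s = (math.log(Z(n)) - math.log(Z(100 * n))) / math.log(100)
    slopes.append(s)
    print(f"n = {n:g}: local slope of -log Z_n vs. log n = {s:.3f}")
```

Running this, the slopes climb slowly toward 1/2 rather than toward 1, which is the kind of large-sample behavior that the paper's polyhedral computations capture exactly for latent tree and forest models.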

Article information


Received: December 2014
Revised: September 2015
First available in Project Euclid: 4 February 2017

Keywords: algebraic statistics; Gaussian graphical model; latent tree models; marginal likelihood; multivariate normal distribution; singular learning theory


Drton, Mathias; Lin, Shaowei; Weihs, Luca; Zwiernik, Piotr. Marginal likelihood and model selection for Gaussian latent tree and forest models. Bernoulli 23 (2017), no. 2, 1202--1232. doi:10.3150/15-BEJ775.


