Bayesian Analysis

Bayesian Inference in Nonparanormal Graphical Models

Jami J. Mulgrave and Subhashis Ghosal

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.



Gaussian graphical models have been used to study intrinsic dependence among several variables, but the Gaussianity assumption may be restrictive in many applications. A nonparanormal graphical model is a semiparametric generalization for continuous variables in which the variables are assumed to follow a Gaussian graphical model only after some unknown smooth monotone transformation is applied to each of them. We consider a Bayesian approach in the nonparanormal graphical model by putting priors on the unknown transformations through a random series based on B-splines, where the coefficients are ordered to induce monotonicity. A truncated normal prior leads to partial conjugacy in the model and is useful for posterior simulation using Gibbs sampling. On the underlying precision matrix of the transformed variables, we consider a spike-and-slab prior and use an efficient posterior Gibbs sampling scheme. We use the Bayesian Information Criterion to choose the hyperparameters for the spike-and-slab prior. We present a posterior consistency result on the underlying transformation and the precision matrix. We study the numerical performance of the proposed method through an extensive simulation study and finally apply the proposed method to a real data set.
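The monotone-transformation prior described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes a cubic B-spline basis on [0, 1], and sorting an i.i.d. normal draw is used here as a crude stand-in for a truncated normal prior restricted to the nondecreasing cone. The function name and grid choices are hypothetical.

```python
import numpy as np
from scipy.interpolate import BSpline

def monotone_bspline(x, coefs, degree=3, lo=0.0, hi=1.0):
    """Evaluate sum_j coefs[j] * B_j(x) on [lo, hi].

    If coefs is nondecreasing, the spline is nondecreasing: the
    derivative of a B-spline is a nonnegative combination of the
    coefficient differences, which is the ordering used to induce
    monotone transformations.
    """
    J = len(coefs)
    # Clamped knot vector sized so the basis has exactly J functions:
    # len(knots) must equal J + degree + 1.
    n_interior = J - degree - 1
    interior = np.linspace(lo, hi, n_interior + 2)[1:-1]
    knots = np.concatenate([[lo] * (degree + 1), interior, [hi] * (degree + 1)])
    return BSpline(knots, np.asarray(coefs, dtype=float), degree)(x)

# Sorted i.i.d. normal draw: a stand-in for ordered (truncated normal)
# coefficients, only to demonstrate the induced monotonicity.
rng = np.random.default_rng(0)
theta = np.sort(rng.normal(size=8))

xs = np.linspace(0.0, 1.0, 200)
fx = monotone_bspline(xs, theta)
assert np.all(np.diff(fx) >= -1e-10)  # nondecreasing on the grid
```

The key property exploited here is classical: a B-spline with nondecreasing coefficients is itself nondecreasing, so ordering the series coefficients is enough to constrain the random transformation to be monotone.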

Article information

Bayesian Anal., Advance publication (2018), 27 pages.

First available in Project Euclid: 5 June 2019

Primary: 62F15 (Bayesian inference); 62G05 (Estimation); 62-09 (Graphical methods)

Keywords: Bayesian inference; nonparanormal; Gaussian graphical models; sparsity; continuous shrinkage prior

Creative Commons Attribution 4.0 International License.


Mulgrave, Jami J.; Ghosal, Subhashis. Bayesian Inference in Nonparanormal Graphical Models. Bayesian Anal., advance publication, 5 June 2019. doi:10.1214/19-BA1159.



  • Arbel, J., Gayraud, G., and Rousseau, J. (2013). “Bayesian optimal adaptive estimation using a sieve prior.” Scandinavian Journal of Statistics, 40(3):549–570.
  • Armagan, A., Dunson, D. B., and Lee, J. (2013). “Generalized double Pareto shrinkage.” Statistica Sinica, 23(1):119–143.
  • Baldi, P., Brunak, S., Chauvin, Y., Andersen, C. A. F., and Nielsen, H. (2000). “Assessing the accuracy of prediction algorithms for classification: an overview.” Bioinformatics, 16(5):412–424.
  • Banerjee, O., El Ghaoui, L., and d’Aspremont, A. (2008). “Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data.” Journal of Machine Learning Research, 9:485–516.
  • Banerjee, S. and Ghosal, S. (2014). “Posterior convergence rates for estimating large precision matrices using graphical models.” Electronic Journal of Statistics, 8(2):2111–2137.
  • Banerjee, S. and Ghosal, S. (2015). “Bayesian structure learning in graphical models.” Journal of Multivariate Analysis, 136:147–162.
  • Barbieri, M. M. and Berger, J. O. (2004). “Optimal predictive model selection.” The Annals of Statistics, 32(3):870–897.
  • Bhattacharya, A., Pati, D., Pillai, N. S., and Dunson, D. B. (2015). “Dirichlet-Laplace priors for optimal shrinkage.” Journal of the American Statistical Association, 110(512):1479–1490.
  • Carter, C. K., Wong, F., and Kohn, R. (2011). “Constructing priors based on model size for nondecomposable Gaussian graphical models: a simulation based approach.” Journal of Multivariate Analysis, 102(5):871–883.
  • Carvalho, C. M., Polson, N. G., and Scott, J. G. (2010). “The horseshoe estimator for sparse signals.” Biometrika, 97(2):465–480.
  • Choudhuri, N., Ghosal, S., and Roy, A. (2007). “Nonparametric binary regression using a Gaussian process prior.” Statistical Methodology, 4(2):227–243.
  • Dahl, J., Roychowdhury, V., and Vandenberghe, L. (2005). “Maximum likelihood estimation of Gaussian graphical models: numerical implementation and topology selection.” Technical report, University of California, Los Angeles.
  • Dahl, J., Vandenberghe, L., and Roychowdhury, V. (2008). “Covariance selection for nonchordal graphs via chordal embedding.” Optimization Methods and Software, 23(4):501–520.
  • d’Aspremont, A., Banerjee, O., and El Ghaoui, L. (2008). “First-order methods for sparse covariance selection.” SIAM Journal on Matrix Analysis and Applications, 30(1):56–66.
  • de Jonge, R. and van Zanten, J. (2012). “Adaptive estimation of multivariate functions using conditionally Gaussian tensor-product spline priors.” Electronic Journal of Statistics, 6:1984–2001.
  • Dobra, A. and Lenkoski, A. (2011). “Copula Gaussian graphical models and their application to modeling functional disability data.” The Annals of Applied Statistics, 5(2A):969–993.
  • Foygel, R. and Drton, M. (2010). “Extended Bayesian information criteria for Gaussian graphical models.” In Advances in Neural Information Processing Systems 23, pages 604–612.
  • Friedman, J., Hastie, T., and Tibshirani, R. (2008). “Sparse inverse covariance estimation with the graphical lasso.” Biostatistics, 9(3):432–441.
  • Ghosal, S. and van der Vaart, A. (2017). Fundamentals of Nonparametric Bayesian Inference. Cambridge Series in Statistical and Probabilistic Mathematics (44). Cambridge University Press, Cambridge.
  • Giudici, P. (1999). “Decomposable graphical Gaussian model determination.” Biometrika, 86(4):785–801.
  • Hájek, J., Šidák, Z., and Sen, P. K. (1999). Theory of Rank Tests. Probability and Mathematical Statistics. Academic Press, Inc., San Diego, CA, second edition.
  • Lenk, P. J. and Choi, T. (2017). “Bayesian Analysis of Shape-Restricted Functions using Gaussian Process Priors.” Statistica Sinica, 27(1): 43–69.
  • Letac, G. and Massam, H. (2007). “Wishart distributions for decomposable graphs.” The Annals of Statistics, 35(3):1278–1323.
  • Liu, H., Han, F., Yuan, M., Lafferty, J., and Wasserman, L. (2012). “High-dimensional semiparametric Gaussian copula graphical models.” The Annals of Statistics, 40(4):2293–2326.
  • Liu, H., Lafferty, J. D., and Wasserman, L. A. (2009). “The nonparanormal: semiparametric estimation of high dimensional undirected graphs.” Journal of Machine Learning Research, 10:2295–2328.
  • Liu, H., Roeder, K., and Wasserman, L. (2010). “Stability approach to regularization selection (StARS) for high dimensional graphical models.” In Advances in Neural Information Processing Systems 23, pages 1432–1440, USA.
  • Lu, Z. (2009). “Smooth optimization approach for sparse covariance selection.” SIAM Journal on Optimization, 19(4):1807–1827.
  • Lysen, S. (2009). Permuted inclusion criterion: a variable selection technique. PhD thesis, Publicly Accessible Penn Dissertations, 28.
  • Mazumder, R. and Hastie, T. (2012). “The graphical lasso: new insights and alternatives.” Electronic Journal of Statistics, 6:2125–2149.
  • Meinshausen, N. and Bühlmann, P. (2006). “High-dimensional graphs and variable selection with the lasso.” The Annals of Statistics, 34(3):1436–1462.
  • Mohammadi, A., Abegaz, F., van den Heuvel, E., and Wit, E. C. (2017). “Bayesian modelling of Dupuytren disease by using Gaussian copula graphical models.” Journal of the Royal Statistical Society: Series C (Applied Statistics), 66(3):629–645.
  • Mohammadi, A. and Wit, E. C. (2015). “Bayesian structure learning in sparse Gaussian graphical models.” Bayesian Analysis, 10(1):109–138.
  • Mohammadi, R. and Wit, E. C. (2017). “BDgraph: an R package for Bayesian structure learning in graphical models.” arXiv preprint arXiv:1501.05108.
  • Mohammadi, R. and Wit, E. C. (2019). “BDgraph: Bayesian structure learning in graphical models using birth-death MCMC.” R package version 2.57.
  • Mulgrave, J. J. and Ghosal, S. (2019). “Supplementary Material for Bayesian Inference in Nonparanormal Graphical Models.” Bayesian Analysis.
  • Pakman, A. and Paninski, L. (2014). “Exact Hamiltonian Monte Carlo for truncated multivariate Gaussians.” Journal of Computational and Graphical Statistics, 23(2):518–542.
  • Pitt, M., Chan, D., and Kohn, R. (2006). “Efficient Bayesian inference for Gaussian copula regression models.” Biometrika, 93(3):537–554.
  • Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge, Mass.
  • Rivoirard, V. and Rousseau, J. (2012). “Posterior concentration rates for infinite dimensional exponential families.” Bayesian Analysis, 7(2):311–334.
  • Rothman, A. J., Bickel, P. J., Levina, E., and Zhu, J. (2008). “Sparse permutation invariant covariance estimation.” Electronic Journal of Statistics, 2:494–515.
  • Royston, J. P. (1982). “Algorithm AS 177: expected normal order statistics (exact and approximate).” Applied Statistics, 31(2):161.
  • Scheinberg, K., Ma, S., and Goldfarb, D. (2010). “Sparse inverse covariance selection via alternating linearization methods.” In Proceedings of the 23rd International Conference on Neural Information Processing Systems – Volume 2, NIPS’10, pages 2101–2109, USA. Curran Associates Inc.
  • Scheipl, F., Fahrmeir, L., and Kneib, T. (2012). “Spike-and-slab priors for function selection in structured additive regression models.” Journal of the American Statistical Association, 107(500):1518–1532.
  • Shen, W. and Ghosal, S. (2015). “Adaptive Bayesian procedures using random series priors: adaptive Bayesian procedures.” Scandinavian Journal of Statistics, 42(4):1194–1213.
  • Talluri, R., Baladandayuthapani, V., and Mallick, B. K. (2014). “Bayesian sparse graphical models and their mixtures: sparse graphical modelling.” Stat, 3(1):109–125.
  • Uhler, C., Lenkoski, A., and Richards, D. (2018). “Exact formulas for the normalizing constants of Wishart distributions for graphical models.” The Annals of Statistics, 46(1):90–118.
  • van der Vaart, A. and van Zanten, H. (2007). “Bayesian inference with rescaled Gaussian process priors.” Electronic Journal of Statistics, 1:433–448.
  • Wang, H. (2012). “Bayesian graphical lasso models and efficient posterior computation.” Bayesian Analysis, 7(4):867–886.
  • Wang, H. (2015). “Scaling it up: stochastic search structure learning in graphical models.” Bayesian Analysis, 10(2):351–377.
  • Wang, H. and Li, S. Z. (2012). “Efficient Gaussian graphical model determination under G-Wishart prior distributions.” Electronic Journal of Statistics, 6:168–198.
  • Wille, A., Zimmermann, P., Vranová, E., Fürholz, A., Laule, O., Bleuler, S., Hennig, L., Prelić, A., von Rohr, P., Thiele, L., Zitzler, E., Gruissem, W., and Bühlmann, P. (2004). “Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana.” Genome Biology, 5(11):R92–R92.
  • Witten, D. M., Friedman, J. H., and Simon, N. (2011). “New insights and faster computations for the graphical lasso.” Journal of Computational and Graphical Statistics, 20(4):892–900.
  • Wong, F., Carter, C. K., and Kohn, R. (2003). “Efficient estimation of covariance selection models.” Biometrika, 90(4):809–830.
  • Yuan, M. and Lin, Y. (2007). “Model selection and estimation in the Gaussian graphical model.” Biometrika, 94(1):19–35.
  • Zhao, T., Li, X., Liu, H., Roeder, K., Lafferty, J., and Wasserman, L. (2015). “huge: high-dimensional undirected graph estimation.” R package version 1.2.7.

Supplemental materials

  • Supplementary Material for Bayesian Inference in Nonparanormal Graphical Models. The supplement includes the proofs of the consistency theorems.
  • GitHub Repository: Bayesian Nonparanormal. The code used to run the methods described in this paper is available on GitHub.