Electronic Journal of Statistics

High-dimensional Ising model selection with Bayesian information criteria

Rina Foygel Barber and Mathias Drton

Full-text: Open access


We consider the use of Bayesian information criteria for selection of the graph underlying an Ising model. In an Ising model, the full conditional distribution of each variable is a logistic regression model, so variable selection techniques for regression can identify the neighborhood of each node and, thus, the entire graph. We prove high-dimensional consistency results for this pseudo-likelihood approach to graph selection when Bayesian information criteria are used for the variable selection problems in the logistic regressions. The results pertain to sparse scenarios, and, following related prior work, the information criteria we consider incorporate an explicit prior that encourages sparsity.
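To make the pseudo-likelihood idea concrete, here is a minimal sketch (not the authors' implementation) of neighborhood selection for a toy Ising model: each node is regressed on candidate sets of other nodes via logistic regression, each candidate set is scored with an extended BIC that adds a sparsity-encouraging prior term to the classical penalty, and the per-node neighborhoods are combined into a graph. All names, the coupling strength, and the toy chain graph below are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def logistic_loglik(X, y, n_iter=50):
    """Fit logistic regression by Newton's method; return the maximized log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = np.clip(X @ beta, -30, 30)
        p = 1.0 / (1.0 + np.exp(-z))
        H = X.T @ ((p * (1.0 - p))[:, None] * X) + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(H, X.T @ (y - p))
    z = np.clip(X @ beta, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def ebic_neighborhood(data, j, gamma=0.5, max_size=2):
    """Pick the neighborhood of node j by minimizing an extended BIC over small subsets."""
    n, p = data.shape
    y = (data[:, j] + 1) / 2.0          # map {-1,+1} spins to {0,1} responses
    others = [k for k in range(p) if k != j]
    best_score, best_set = np.inf, ()
    for size in range(max_size + 1):
        for S in combinations(others, size):
            X = np.column_stack([np.ones(n)] + [data[:, k] for k in S])
            score = (-2.0 * logistic_loglik(X, y)
                     + size * np.log(n)                      # classical BIC penalty
                     + 2.0 * gamma * size * np.log(p - 1))   # extra sparsity prior term
            if score < best_score:
                best_score, best_set = score, S
    return set(best_set)

# Gibbs sampling from a 4-node Ising chain 0-1-2-3 with coupling theta (toy example).
rng = np.random.default_rng(0)
p_nodes, theta = 4, 1.2
true_edges = {(0, 1), (1, 2), (2, 3)}
x, samples = rng.choice([-1, 1], size=p_nodes), []
for sweep in range(5000):
    for j in range(p_nodes):
        field = theta * sum(x[k] for k in range(p_nodes)
                            if (min(j, k), max(j, k)) in true_edges)
        x[j] = 1 if rng.random() < 1.0 / (1.0 + np.exp(-2.0 * field)) else -1
    if sweep >= 200 and sweep % 10 == 0:   # burn-in, then thin
        samples.append(x.copy())
data = np.array(samples, dtype=float)

# Neighborhoods per node, combined into an edge set by the OR rule.
neighborhoods = {j: ebic_neighborhood(data, j) for j in range(p_nodes)}
est_edges = {(min(j, k), max(j, k)) for j, nb in neighborhoods.items() for k in nb}
print(sorted(est_edges))
```

With a strong coupling and several hundred samples, the true chain edges carry a large likelihood gain and survive the extended-BIC penalty, while most spurious edges are excluded; the gamma parameter trades off between the classical BIC (gamma = 0) and heavier protection against false edges in large model spaces.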

Article information

Electron. J. Statist., Volume 9, Number 1 (2015), 567-607.

First available in Project Euclid: 24 March 2015


Primary: 62F12 (Asymptotic properties of estimators); 62J12 (Generalized linear models)

Keywords: Bayesian information criterion; graphical model; logistic regression; log-linear model; neighborhood selection; variable selection


Barber, Rina Foygel; Drton, Mathias. High-dimensional Ising model selection with Bayesian information criteria. Electron. J. Statist. 9 (2015), no. 1, 567--607. doi:10.1214/15-EJS1012. https://projecteuclid.org/euclid.ejs/1427203129



References

  • Anandkumar, A., Tan, V. Y. F., Huang, F. and Willsky, A. S. (2012). High-dimensional structure estimation in Ising models: Local separation criterion. Ann. Statist. 40 1346–1375.
  • Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. J. Roy. Statist. Soc. Ser. B 34 75–83.
  • Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. J. Roy. Statist. Soc. Ser. B 36 192–236. With discussion by D. R. Cox, A. G. Hawkes, P. Clifford, P. Whittle, K. Ord, R. Mead, J. M. Hammersley, and M. S. Bartlett and with a reply by the author.
  • Bogdan, M., Ghosh, J. K. and Doerge, R. W. (2004). Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci. Genetics 167 989–999.
  • Broman, K. W. and Speed, T. P. (2002). A model selection approach for the identification of quantitative trait loci in experimental crosses. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 641–656.
  • Bühlmann, P. and van de Geer, S. (2011). Statistics for high-dimensional data. Springer Series in Statistics. Springer, Heidelberg. Methods, theory and applications.
  • Chen, J. and Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95 759–771.
  • Chen, J. and Chen, Z. (2012). Extended BIC for small-$n$-large-$P$ sparse GLM. Statist. Sinica 22 555–574.
  • Foygel, R. (2012). Prediction and model selection for high-dimensional data with sparse or low-rank structure. PhD thesis, The University of Chicago.
  • Foygel, R. and Drton, M. (2011). Bayesian model choice and information criteria in sparse generalized linear models. ArXiv e-prints 1112.5635.
  • Foygel, R. and Drton, M. (2014). High-dimensional Ising model selection with Bayesian information criteria. ArXiv e-prints 1403.3374v1.
  • Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33 1–22.
  • Frommlet, F., Ruhaltinger, F., Twaróg, P. and Bogdan, M. (2012). Modified versions of Bayesian information criterion for genome-wide association studies. Comput. Statist. Data Anal. 56 1038–1051.
  • Höfling, H. and Tibshirani, R. (2009). Estimation of sparse binary pairwise Markov networks using pseudo-likelihoods. J. Mach. Learn. Res. 10 883–906.
  • Hothorn, T., Bühlmann, P., Kneib, T., Schmid, M. and Hofner, B. (2013). mboost: Model-based boosting. R package version 2.2-3.
  • Jalali, A., Johnson, C. C. and Ravikumar, P. K. (2011). On learning discrete graphical models using greedy methods. In Advances in Neural Information Processing Systems 1935–1943.
  • Kindermann, R. and Snell, J. L. (1980). Markov random fields and their applications. Contemporary Mathematics 1. American Mathematical Society, Providence, R.I.
  • Koltchinskii, V. (2011). Oracle inequalities in empirical risk minimization and sparse recovery problems. Lecture Notes in Mathematics 2033. Springer, Heidelberg. Lectures from the 38th Probability Summer School held in Saint-Flour, 2008, École d'Été de Probabilités de Saint-Flour [Saint-Flour Probability Summer School].
  • Lauritzen, S. L. (1996). Graphical models. Oxford Statistical Science Series 17. The Clarendon Press Oxford University Press, New York. Oxford Science Publications.
  • Loh, P.-L. and Wainwright, M. J. (2013). Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses. Ann. Statist. 41 3022–3049.
  • Lorentz, G. G., Golitschek, M. v. and Makovoz, Y. (1996). Constructive approximation. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 304. Springer-Verlag, Berlin. Advanced problems.
  • Luo, S. and Chen, Z. (2013). Selection consistency of EBIC for GLIM with non-canonical links and diverging number of parameters. Stat. Interface 6 275–284.
  • Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34 1436–1462.
  • Meinshausen, N. and Bühlmann, P. (2010). Stability selection. J. R. Stat. Soc. Ser. B Stat. Methodol. 72 417–473.
  • Menne, M. J., Williams Jr., C. N. and Vose, R. S. (2011). United States Historical Climatology Network Daily Temperature, Precipitation, and Snow Data.
  • Ravikumar, P., Wainwright, M. J. and Lafferty, J. D. (2010). High-dimensional Ising model selection using $\ell_1$-regularized logistic regression. Ann. Statist. 38 1287–1319.
  • Roudi, Y., Aurell, E. and Hertz, J. A. (2009). Statistical physics of pairwise probability models. Frontiers in Computational Neuroscience 3.
  • Santhanam, N. P. and Wainwright, M. J. (2012). Information-theoretic limits of selecting binary graphical models in high dimensions. IEEE Transactions on Information Theory 58 4117–4134.
  • Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461–464.
  • Scott, J. G. and Berger, J. O. (2010). Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. Ann. Statist. 38 2587–2619.
  • Shorack, G. R. (2000). Probability for statisticians. Springer Texts in Statistics. Springer-Verlag, New York.
  • Żak-Szatkowska, M. and Bogdan, M. (2011). Modified versions of the Bayesian information criterion for sparse generalized linear models. Comput. Statist. Data Anal. 55 2908–2924.