• Bernoulli
  • Volume 19, Number 4 (2013), 1465-1483.

Multivariate Bernoulli distribution

Bin Dai, Shilin Ding, and Grace Wahba



In this paper, we consider the multivariate Bernoulli distribution as a model to estimate the structure of graphs with binary nodes. This distribution is discussed in the framework of the exponential family, and its statistical properties regarding independence of the nodes are demonstrated. Importantly, the model can estimate not only the main effects and pairwise interactions among the nodes but also higher-order interactions, allowing for complex clique effects. We compare the multivariate Bernoulli model with two existing graphical inference models, the Ising model and the multivariate Gaussian model, in which only the pairwise interactions are considered. The multivariate Bernoulli distribution also has the notable property that independence and uncorrelatedness of the component random variables are equivalent. Both the marginal and conditional distributions of a subset of variables in the multivariate Bernoulli distribution again follow the multivariate Bernoulli distribution. Furthermore, the multivariate Bernoulli logistic model is developed under generalized linear model theory, using the canonical link function to include covariate information on the nodes, edges and cliques. We also consider variable selection techniques such as LASSO in the logistic model to impose sparsity structure on the graph. Finally, we discuss extending the smoothing spline ANOVA approach to the multivariate Bernoulli logistic model to enable estimation of non-linear effects of the predictor variables.
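The equivalence of independence and uncorrelatedness stated above can be checked numerically in the simplest (bivariate) case. The sketch below is illustrative only and is not taken from the paper: it assumes the usual log-linear (exponential family) parameterization of a 2x2 table with cell probabilities p00, p10, p01, p11, and the function names are hypothetical. Under that assumption the interaction parameter is f12 = log(p11 p00 / (p10 p01)) and the covariance is p11 p00 - p10 p01, so one vanishes exactly when the other does.

```python
import math


def natural_parameters(p00, p10, p01, p11):
    """Return (f1, f2, f12) for a bivariate Bernoulli distribution.

    p_ab denotes P(Y1 = a, Y2 = b); all four cells are assumed positive.
    The parameterization is an assumption of this sketch (log-linear form).
    """
    f1 = math.log(p10 / p00)                  # main effect of node 1
    f2 = math.log(p01 / p00)                  # main effect of node 2
    f12 = math.log(p11 * p00 / (p10 * p01))   # pairwise interaction (log odds ratio)
    return f1, f2, f12


def covariance(p00, p10, p01, p11):
    """Cov(Y1, Y2) = E[Y1 Y2] - E[Y1] E[Y2]."""
    mean1 = p10 + p11   # P(Y1 = 1)
    mean2 = p01 + p11   # P(Y2 = 1)
    return p11 - mean1 * mean2


def independent_cells(a, b):
    """Cell probabilities when Y1 ~ Bernoulli(a) and Y2 ~ Bernoulli(b) are independent."""
    return ((1 - a) * (1 - b), a * (1 - b), (1 - a) * b, a * b)


independent = independent_cells(0.3, 0.6)
dependent = (0.4, 0.1, 0.1, 0.4)   # mass concentrated on the agreeing cells

for name, cells in [("independent", independent), ("dependent", dependent)]:
    f1, f2, f12 = natural_parameters(*cells)
    print(name, "f12 =", round(f12, 6), "cov =", round(covariance(*cells), 6))
```

For the independent table both the interaction term and the covariance are zero up to floating-point error, while for the dependent table both are strictly positive.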

Article information

Bernoulli, Volume 19, Number 4 (2013), 1465-1483.

First available in Project Euclid: 27 August 2013

Keywords: Bernoulli distribution; generalized linear models; LASSO; smoothing spline


Dai, Bin; Ding, Shilin; Wahba, Grace. Multivariate Bernoulli distribution. Bernoulli 19 (2013), no. 4, 1465-1483. doi:10.3150/12-BEJSP10.

