Electronic Journal of Statistics

Constrained parameter estimation with uncertain priors for Bayesian networks

Ali Karimnezhad, Peter J. F. Lucas, and Ahmad Parsian

Full-text: Open access


In this paper, we investigate parameter learning in Bayesian networks and, in particular, deal with uncertainty about the prior within a Bayesian framework. Parameter learning is explored in the context of Bayesian inference, and we introduce Bayes, constrained Bayes, and robust Bayes parameter learning methods. Bayes and constrained Bayes estimates of the parameters are obtained to meet the twin objectives of simultaneous estimation and closeness between the histogram of the estimates and the posterior estimate of the parameter histogram. To treat the prior uncertainty, we consider several classes of prior distributions and derive simultaneous Posterior Regret Gamma Minimax estimates of the parameters. The merits of the various procedures are evaluated using synthetic data and a real clinical dataset.
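To illustrate the constrained Bayes idea mentioned in the abstract — estimates whose ensemble spread matches the posterior expectation of the parameters' spread — the following is a minimal sketch in the spirit of Louis (1984) and Ghosh (1992). The function name and the toy posterior summaries are illustrative assumptions, not the paper's implementation, and the sketch assumes posterior means and variances have already been computed.

```python
import numpy as np

def constrained_bayes(post_means, post_vars):
    """Constrained Bayes adjustment (Louis 1984; Ghosh 1992).

    Posterior means are typically over-shrunken: their sample spread
    underestimates the posterior expected spread of the parameters.
    The adjustment rescales the means about their average by a factor
    a >= 1 so that the spread of the estimates matches that posterior
    expectation, while leaving the overall average unchanged.
    """
    post_means = np.asarray(post_means, dtype=float)
    post_vars = np.asarray(post_vars, dtype=float)
    m = post_means.mean()
    h1 = post_vars.sum()                    # summed posterior variances
    h2 = ((post_means - m) ** 2).sum()      # spread of the Bayes estimates
    a = np.sqrt(1.0 + h1 / h2)              # inflation factor, a >= 1
    return m + a * (post_means - m)

# Toy example: four posterior means with small posterior variances.
cb = constrained_bayes([0.2, 0.4, 0.6, 0.8], [0.01, 0.01, 0.01, 0.01])
# The adjusted estimates keep the same average but are spread out so
# that their sum of squared deviations equals h1 + h2.
```

By construction, the adjusted estimates preserve the average of the posterior means and the ordering of the components; only their dispersion is inflated.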

Article information

Electron. J. Statist., Volume 11, Number 2 (2017), 4000-4032.

Received: January 2017
First available in Project Euclid: 19 October 2017


Primary: 62F15: Bayesian inference 62C10: Bayesian problems; characterization of Bayes procedures
Secondary: 62F30: Inference under constraints 62F35: Robustness and adaptive procedures

Bayesian networks; constrained Bayes estimation; directed acyclic graph; posterior regret; robust Bayesian learning

This article is licensed under the Creative Commons Attribution 4.0 International License.


Karimnezhad, Ali; Lucas, Peter J. F.; Parsian, Ahmad. Constrained parameter estimation with uncertain priors for Bayesian networks. Electron. J. Statist. 11 (2017), no. 2, 4000--4032. doi:10.1214/17-EJS1350. https://projecteuclid.org/euclid.ejs/1508378636



  • [1] Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis. Springer Science and Business Media.
  • [2] Berger, J. O. (1990). Robust Bayesian analysis: sensitivity to the prior. J. Stat. Plan. Infer. 25(3) 303-328.
  • [3] Berger, J. O. (1994). An overview of robust Bayesian analysis. Test 3(1) 5-124.
  • [4] Buntine, W. (1991). Theory refinement on Bayesian networks. In: Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence, 52-60. Morgan Kaufmann Publishers Inc.
  • [5] Chen, Y. C., Wheeler, T. A. and Kochenderfer, M. J. (2017). Learning discrete Bayesian networks from continuous data. J. Artif. Intell. Res. 59 103-132.
  • [6] Cheng, J., Greiner, R., Kelly, J., Bell, D. and Liu, W. (2002). Learning Bayesian networks from data: an information-theory based approach. Artif. Intell. 137(1-2) 43-90.
  • [7] Cooper, G. F. (1989). Current research directions in the development of expert systems based on belief networks. Appl. Stoch. Model. Data Anal. 5(1) 39-52.
  • [8] Cooper, G. F. and Herskovits, E. (1992). A Bayesian method for the induction of probabilistic networks from data. Mach. Learn. 9(4) 309-347.
  • [9] de Campos, C. P. and Ji, Q. (2008). Improving Bayesian network parameter learning using constraints. In: Proceedings of the 19th International Conference on Pattern Recognition, 1-4.
  • [10] de Campos, L. M. (2006). A scoring function for learning Bayesian networks based on mutual information and conditional independence tests. J. Mach. Learn. Res. 7 2149-2187.
  • [11] Frey, J. and Cressie, N. (2003). Some results on constrained Bayes estimators. Stat. Prob. Lett. 65(4) 389-399.
  • [12] Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (2014). Bayesian Data Analysis. Vol. 2. Chapman and Hall/CRC, Boca Raton, FL, USA.
  • [13] Ghosh, M. (1992). Constrained Bayes estimation with applications. J. Am. Stat. Assoc. 87(418) 533-540.
  • [14] Ghosh, M., Kim, M. J. and Kim, D. H. (2008). Constrained Bayes and empirical Bayes estimation under random effects normal ANOVA model with balanced loss function. J. Stat. Plan. Infer. 138(7) 2017-2028.
  • [15] Ghosh, M., Kim, M. J. and Kim, D. H. (2007). Constrained Bayes and empirical Bayes estimation with balanced loss functions. Commun. Stat. Theory Methods 36(8) 1527-1542.
  • [16] Ghosh, M. and Maiti, T. (1999). Adjusted Bayes estimators with applications to small area estimation. Sankhya: Indian J. Stat., Series B 71-90.
  • [17] Grünwald, P. D. (2007). The Minimum Description Length Principle. MIT Press.
  • [18] Heckerman, D., Geiger, D. and Chickering, D. M. (1995). Learning Bayesian networks: the combination of knowledge and statistical data. Mach. Learn. 20(3) 197-243.
  • [19] Insua, D. R. and Ruggeri, F. (2000). Robust Bayesian Analysis. Vol. 152. Springer.
  • [20] Ickstadt, K., Bornkamp, B., Grzegorczyk, M., Wieczorek, J., Sheriff, M. R., Grecco, H. E. and Zamir, E. (2011). Nonparametric Bayesian networks. Bayesian Stat. 9 283.
  • [21] Karimnezhad, A. and Moradi, F. (2016). Bayes, E-Bayes and robust Bayes prediction of a future observation under precautionary prediction loss functions with applications. Appl. Math. Model. 40(15) 7051-7061.
  • [22] Karimnezhad, A., Niazi, S. and Parsian, A. (2014). Bayes and robust Bayes prediction with an application to a rainfall prediction problem. J. Korean Stat. Soc. 43(2) 275-291.
  • [23] Karimnezhad, A. and Parsian, A. (2014). Robust Bayesian methodology with applications in credibility premium derivation and future claim size prediction. AStA Adv. Stat. Anal. 98(3) 287-303.
  • [24] Korb, K. B. and Nicholson, A. E. (2010). Bayesian Artificial Intelligence. 2nd Ed. CRC Press, Boca Raton, FL, USA.
  • [25] Koski, T. and Noble, J. (2011). Bayesian Networks: An Introduction. Vol. 924. John Wiley and Sons.
  • [26] McLachlan, G. and Krishnan, T. (1997). The EM Algorithm and Extensions. John Wiley and Sons.
  • [27] Lauritzen, S. L. (1995). The EM algorithm for graphical association models with missing data. Comp. Stat. Data Anal. 19(2) 191-201.
  • [28] Lauritzen, S. L. and Spiegelhalter, D. J. (1988). Local computations with probabilities on graphical structures and their application to expert systems. J. Royal Stat. Soc. Series B (Methodological) 157-224.
  • [29] Louis, T. A. (1984). Estimating a population of parameter values using Bayes and empirical Bayes methods. J. Am. Stat. Assoc. 79(386) 393-398.
  • [30] Nagarajan, R., Scutari, M. and Lèbre, S. (2013). Bayesian Networks in R with Applications in Systems Biology. Springer, New York.
  • [31] Nielsen, T. D. and Jensen, F. V. (2009). Bayesian Networks and Decision Graphs. Springer Science and Business Media.
  • [32] Oniśko, A., Lucas, P. and Druzdzel, M. J. (2001). Comparison of rule-based and Bayesian network approaches in medical diagnostic systems. In: Artificial Intelligence in Medicine, 283-292. Springer.
  • [33] Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
  • [34] Ramoni, M. and Sebastiani, P. (2001). Robust learning with missing data. Mach. Learn. 45 147-170.
  • [35] Ramoni, M. and Sebastiani, P. (2003). Bayesian methods. In: Intelligent Data Analysis, 131-168. Springer.
  • [36] Riggelsen, C. and Feelders, A. (2005). Learning Bayesian network models from incomplete data using importance sampling. In: Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, 301-308.
  • [37] Robert, C. (2007). The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation. Springer Science and Business Media.
  • [38] Singh, M. (1997). Learning Bayesian networks from incomplete data. In: Proceedings of the Fourteenth National Conference on Artificial Intelligence, Providence, RI, 534-539.
  • [39] Scutari, M. (2010). bnlearn: Bayesian network structure learning. R package.
  • [40] Sebastiani, P., Abad, M. M. and Ramoni, M. F. (2010). Bayesian networks. In: Data Mining and Knowledge Discovery Handbook, 175-208. Springer.
  • [41] Silander, T., Kontkanen, P. and Myllymäki, P. (2012). On sensitivity of the MAP Bayesian network structure to the equivalent sample size parameter. arXiv preprint arXiv:1206.5293.
  • [42] Singh, P., Singh, S. and Singh, U. (2008). Bayes estimator of inverse Gaussian parameters under general entropy loss function using Lindley's approximation. Commun. Stat. Simul. Comput. 37(9) 1750-1762.
  • [43] Spiegelhalter, D. J. (1989). Probabilistic reasoning in expert systems. Am. J. Math. Management Sci. 9(3-4) 191-210.
  • [44] Spirtes, P., Glymour, C. N. and Scheines, R. (2000). Causation, Prediction, and Search. Vol. 81. MIT Press.
  • [45] Steck, H. and Jaakkola, T. S. (2002). On the Dirichlet prior and Bayesian regularization. In: Advances in Neural Information Processing Systems, 697-704.
  • [46] Thomas, G. B., Finney, R. L., Weir, M. D. and Giordano, F. R. (2001). Thomas' Calculus. Addison-Wesley.
  • [47] Weiss, Y. and Freeman, W. T. (2001). Correctness of belief propagation in Gaussian graphical models of arbitrary topology. Neural Comput. 13(10) 2173-2200.