Electronic Journal of Statistics

Constrained parameter estimation with uncertain priors for Bayesian networks

Ali Karimnezhad, Peter J. F. Lucas, and Ahmad Parsian

Full-text: Open access


In this paper we investigate parameter learning for Bayesian networks and, in particular, address uncertainty about the prior when learning within a Bayesian framework. Parameter learning is explored in the context of Bayesian inference, and we subsequently introduce Bayes, constrained Bayes and robust Bayes parameter learning methods. Bayes and constrained Bayes estimates of the parameters are obtained to meet the twin objective of simultaneous estimation and closeness between the histogram of the estimates and the posterior estimates of the parameter histogram. To handle prior uncertainty, we consider several classes of prior distributions and derive simultaneous Posterior Regret Gamma Minimax estimates of the parameters. We evaluate the merits of the various procedures using synthetic data and a real clinical dataset.

Article information

Electron. J. Statist. Volume 11, Number 2 (2017), 4000-4032.

Received: January 2017
First available in Project Euclid: 19 October 2017

Primary: 62F15 (Bayesian inference), 62C10 (Bayesian problems; characterization of Bayes procedures)
Secondary: 62F30 (Inference under constraints), 62F35 (Robustness and adaptive procedures)

Keywords: Bayesian networks; constrained Bayes estimation; directed acyclic graph; posterior regret; robust Bayesian learning

Creative Commons Attribution 4.0 International License.


Karimnezhad, Ali; Lucas, Peter J. F.; Parsian, Ahmad. Constrained parameter estimation with uncertain priors for Bayesian networks. Electron. J. Statist. 11 (2017), no. 2, 4000-4032. doi:10.1214/17-EJS1350. https://projecteuclid.org/euclid.ejs/1508378636


