Statistical Science

Graphical Models for Genetic Analyses

Steffen L. Lauritzen and Nuala A. Sheehan



This paper introduces graphical models as a natural environment in which to formulate and solve problems in genetics and related areas. Particular emphasis is given to the relationships among various local computation algorithms which have been developed within the hitherto mostly separate areas of graphical models and genetics. The potential of graphical models is explored and illustrated through a number of example applications where the genetic element is substantial or dominating.

Article information

Statist. Sci. Volume 18, Number 4 (2003), 489-514.

First available in Project Euclid: 8 April 2004

Keywords: Bayesian network; forensic genetics; linkage analysis; local computation; peeling; probability propagation; QTL analysis


Lauritzen, Steffen L.; Sheehan, Nuala A. Graphical Models for Genetic Analyses. Statist. Sci. 18 (2003), no. 4, 489–514. doi:10.1214/ss/1081443232.


