The Annals of Applied Statistics

Causal graphical models in systems genetics: A unified framework for joint inference of causal network and genetic architecture for correlated phenotypes

Elias Chaibub Neto, Mark P. Keller, Alan D. Attie, and Brian S. Yandell


Causal inference approaches in systems genetics exploit quantitative trait loci (QTL) genotypes to infer causal relationships among phenotypes. The genetic architecture of each phenotype may be complex, and poorly estimated genetic architectures may compromise the inference of causal relationships among phenotypes. Existing methods assume QTLs are known or inferred without regard to the phenotype network structure. In this paper we develop a QTL-driven phenotype network method (QTLnet) to jointly infer a causal phenotype network and associated genetic architecture for sets of correlated phenotypes. Randomization of alleles during meiosis and the unidirectional influence of genotype on phenotype allow the inference of QTLs causal to phenotypes. Causal relationships among phenotypes can be inferred using these QTL nodes, enabling us to distinguish among phenotype networks that would otherwise be distribution equivalent. We jointly model phenotypes and QTLs using homogeneous conditional Gaussian regression models, and we derive a graphical criterion for distribution equivalence. We validate the QTLnet approach in a simulation study. Finally, we illustrate with simulated data and a real example how QTLnet can be used to infer both direct and indirect effects of QTLs and phenotypes that co-map to a genomic region.
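The role of a QTL node in separating otherwise distribution-equivalent networks can be illustrated with a small simulation. The sketch below is not the QTLnet method itself, which jointly samples network structures and genetic architectures. It only checks the conditional independence that distinguishes Q -> Y1 -> Y2 from Q -> Y2 -> Y1. The 0/1 genotype coding, the effect sizes, the sample size, and the helper names `corr`, `partial_corr`, and `simulate` are all assumptions made for illustration.

```python
import math
import random


def corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly adjusting both for z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate(n=5000, seed=1):
    """Draw data from the hypothetical true model Q -> Y1 -> Y2."""
    rng = random.Random(seed)
    # Genotype coded 0/1, as in a backcross (illustrative assumption).
    q = [rng.choice([0.0, 1.0]) for _ in range(n)]
    # Y1 depends on the QTL; Y2 depends on Y1 only.
    y1 = [1.0 * g + rng.gauss(0, 1) for g in q]
    y2 = [0.8 * a + rng.gauss(0, 1) for a in y1]
    return q, y1, y2


q, y1, y2 = simulate()

# Under Q -> Y1 -> Y2, Q and Y2 are independent given Y1,
# so this partial correlation should be close to zero.
r_true = partial_corr(q, y2, y1)

# Under the reversed network Q -> Y2 -> Y1, Q and Y1 would be
# independent given Y2; the data should contradict that.
r_wrong = partial_corr(q, y1, y2)

print(f"partial corr(Q, Y2 | Y1) = {r_true:.3f}")
print(f"partial corr(Q, Y1 | Y2) = {r_wrong:.3f}")
```

Without the QTL node, Y1 -> Y2 and Y2 -> Y1 imply the same joint distribution for the two phenotypes; with it, only one orientation is compatible with the conditional independence pattern in the data.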

Article information

Ann. Appl. Stat., Volume 4, Number 1 (2010), 320-339.

First available in Project Euclid: 11 May 2010

Keywords: Causal graphical models; QTL mapping; joint inference of phenotype network and genetic architecture; systems genetics; homogeneous conditional Gaussian regression models; Markov chain Monte Carlo


Chaibub Neto, Elias; Keller, Mark P.; Attie, Alan D.; Yandell, Brian S. Causal graphical models in systems genetics: A unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Ann. Appl. Stat. 4 (2010), no. 1, 320–339. doi:10.1214/09-AOAS288.



Supplemental materials

  • Supplementary material: Supplement to “Causal graphical models in systems genetics: A unified framework for joint inference of causal network and genetic architecture for correlated phenotypes”.