The Annals of Applied Statistics

A Bayesian predictive model for imaging genetics with application to schizophrenia

Thierry Chekouo, Francesco C. Stingo, Michele Guindani, and Kim-Anh Do


Imaging genetics has rapidly emerged as a promising approach for investigating the genetic determinants of brain mechanisms that underlie an individual’s behavior or psychiatric condition. In particular, for early detection and targeted treatment of schizophrenia, it is of high clinical relevance to identify genetic variants and imaging-based biomarkers that can be used as diagnostic markers, in addition to commonly used symptom-based assessments. By combining single-nucleotide polymorphism (SNP) arrays and functional magnetic resonance imaging (fMRI), we propose an integrative Bayesian risk prediction model that allows us to discriminate between individuals with schizophrenia and healthy controls, based on a sparse set of discriminatory regions of interest (ROIs) and SNPs. Inference on a regulatory network between SNPs and ROI intensities (ROI–SNP network) is used in a single modeling framework to inform the selection of the discriminatory ROIs and SNPs. We use simulation studies to assess the performance of our method and apply it to data collected from individuals with schizophrenia and healthy controls. We found our approach to outperform competing methods that do not link the ROI–SNP network to the selection of discriminatory markers.

Article information

Ann. Appl. Stat., Volume 10, Number 3 (2016), 1547–1571.

Received: March 2015
Revised: March 2016
First available in Project Euclid: 28 September 2016

Keywords: Imaging genetics; fMRI; data integration; Bayesian variable selection; Markov random field; nonlocal prior


Chekouo, Thierry; Stingo, Francesco C.; Guindani, Michele; Do, Kim-Anh. A Bayesian predictive model for imaging genetics with application to schizophrenia. Ann. Appl. Stat. 10 (2016), no. 3, 1547–1571. doi:10.1214/16-AOAS948.


  • Albert, J. H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. J. Amer. Statist. Assoc. 88 669–679.
  • Atchadé, Y. F., Lartillot, N. and Robert, C. (2013). Bayesian computation for statistical models with intractable normalizing constants. Braz. J. Probab. Stat. 27 416–436.
  • Batmanghelich, N., Dalca, A., Sabuncu, M. and Golland, P. (2013). Joint modeling of imaging and genetics. In Information Processing in Medical Imaging (J. C. Gee, S. Joshi, K. Pohl, W. M. Wells and L. Zöllei, eds.). Lecture Notes in Computer Science 7917 766–777. Springer, Berlin.
  • Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. J. Roy. Statist. Soc. Ser. B 36 192–236.
  • Bowman, F. D. (2014). Brain imaging analysis. Annu. Rev. Stat. Appl. 1 61–85.
  • Calhoun, V. D. and Hugdahl, K. (2012). Cognition and neuroimaging in schizophrenia. Front. Human Neurosci. 6 276.
  • Calhoun, V. D., Liu, J. and Adali, T. (2009). A review of group ICA for fMRI data and ICA for joint inference of imaging, genetic, and ERP data. NeuroImage 45 S163–S172.
  • Cannon, T. D. and Keller, M. C. (2006). Endophenotypes in the genetic analyses of mental disorders. Annu. Rev. Clin. Psychol. 2 267–290.
  • Cao, H., Duan, J., Lin, D., Calhoun, V. and Wang, Y.-P. (2013). Integrating fMRI and SNP data for biomarker identification for schizophrenia with a sparse representation based variable selection method. BMC Medical Genomics 6 Suppl 3 S2.
  • Chekouo, T., Stingo, F. C., Guindani, M. and Do, K. (2016). Supplement to “A Bayesian predictive model for imaging genetics with application to schizophrenia.” DOI:10.1214/16-AOAS948SUPP.
  • Chen, J., Calhoun, V. D., Pearlson, G. D., Ehrlich, S., Turner, J. A., Ho, B.-C., Wassink, T. H., Michael, A. and Liu, J. (2012). Multifaceted genomic risk for brain function in schizophrenia. NeuroImage 61 866–875.
  • Chi, E. C., Allen, G. I., Zhou, H., Kohannim, O., Lange, K. and Thompson, P. M. (2013). Imaging genetics via sparse canonical correlation analysis. In Biomedical Imaging (ISBI), 2013 IEEE 10th International Symposium on 740–743.
  • Damaraju, E., Allen, E. A., Belger, A., Ford, J. M., McEwen, S., Mathalon, D. H., Calhoun, V. D. et al. (2014). Dynamic functional connectivity analysis reveals transient states of dysconnectivity in schizophrenia. NeuroImage: Clinical 5 298–308.
  • Deng, L. and Yu, D. (2013). Deep learning: Methods and applications. Found. Trends Signal Process. 7 197–391.
  • Dettling, M. and Bühlmann, P. (2003). Boosting for tumor classification with gene expression data. Bioinformatics 19 1061–1069.
  • Erhardt, E. B., Rachakonda, S., Bedrick, E. J., Allen, E. A., Adali, T. and Calhoun, V. D. (2011). Comparison of multi-subject ICA methods for analysis of fMRI data. Hum. Brain Mapp. 32 2075–2095.
  • Filipovych, R., Resnick, S. M. and Davatzikos, C. (2011). Multi-kernel classification for integration of clinical and imaging data: Application to prediction of cognitive decline in older adults. In Machine Learning in Medical Imaging (K. Suzuki, F. Wang, D. Shen and P. Yan, eds.). Lecture Notes in Computer Science 7009 26–34. Springer, Berlin.
  • Le Floch, É., Guillemot, V., Frouin, V., Pinel, P., Lalanne, C., Trinchera, L., Tenenhaus, A., Moreno, A., Zilbovicius, M., Bourgeron, T., Dehaene, S., Thirion, B., Poline, J.-B. and Duchesnay, É. (2012). Significant correlation between a set of genetic polymorphisms and a functional brain network revealed by feature selection and sparse Partial Least Squares. NeuroImage 63 11–24.
  • Fornito, A., Zalesky, A., Pantelis, C. and Bullmore, E. T. (2012). Schizophrenia, neuroimaging and connectomics. NeuroImage 62 2296–2314.
  • Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.
  • Friedman, J. H., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33 1–22.
  • Fujiwara, H., Namiki, C., Hirao, K., Miyata, J., Shimizu, M., Fukuyama, H., Sawamoto, N., Hayashi, T. and Murai, T. (2007). Anterior and posterior cingulum abnormalities and their association with psychopathology in schizophrenia: A diffusion tensor imaging study. Schizophr. Res. 95 215–222.
  • George, E. I. and McCulloch, R. E. (1997). Approaches for Bayesian variable selection. Statist. Sinica 7 339–374.
  • Glahn, D. C., Laird, A. R., Ellison-Wright, I., Thelen, S. M., Robinson, J. L., Lancaster, J. L., Bullmore, E. and Fox, P. T. (2008). Meta-analysis of gray matter anomalies in schizophrenia: Application of anatomic likelihood estimation and network analysis. Biological Psychiatry 64 774–781.
  • Goldsmith, J., Huang, L. and Crainiceanu, C. M. (2014). Smooth scalar-on-image regression via spatial Bayesian variable selection. J. Comput. Graph. Statist. 23 46–64.
  • Goldsmith, J., Crainiceanu, C. M., Caffo, B. and Reich, D. (2012). Longitudinal penalized functional regression for cognitive outcomes on neuronal tract measurements. J. R. Stat. Soc. Ser. C. Appl. Stat. 61 453–469.
  • Gollub, R. L., Shoemaker, J. M., King, M. D., White, T., Ehrlich, S., Sponheim, S. R., Clark, V. P., Turner, J. A., Mueller, B. A., Magnotta, V. et al. (2013). The MCIC collection: A shared repository of multi-modal, multi-site brain image data from a clinical investigation of schizophrenia. Neuroinformatics 11 367–388.
  • Hardoon, D. R., Ettinger, U., Mourão-Miranda, J., Antonova, E., Collier, D., Kumari, V., Williams, S. C. R. and Brammer, M. (2009). Correlation-based multivariate analysis of genetic influence on brain volume. Neurosci. Lett. 450 281–286.
  • Hariri, A. R. and Weinberger, D. R. (2003). Imaging genomics. Br. Med. Bull. 65 259–270.
  • Ikeda, M., Yamanouchi, Y., Kinoshita, Y., Kitajima, T., Yoshimura, R., Hashimoto, S., O’Donovan, M. C., Nakamura, J., Ozaki, N. and Iwata, N. (2008). Variants of dopamine and serotonin candidate genes as predictors of response to risperidone treatment in first-episode schizophrenia. Pharmacogenomics 9 1437–1443.
  • Jacob, A. (2013). Limitations of clinical psychiatric diagnostic measurements. J. Neurol. Disord. 2.
  • Johnson, V. E. and Rossell, D. (2010). On the use of non-local prior densities in Bayesian hypothesis tests. J. R. Stat. Soc. Ser. B Stat. Methodol. 72 143–170.
  • Johnson, V. E. and Rossell, D. (2012). Bayesian model selection in high-dimensional settings. J. Amer. Statist. Assoc. 107 649–660.
  • Joo, E. J., Lee, K. Y., Jeong, S. H., Roh, M. S., Kim, S. H., Ahn, Y. M. and Kim, Y. S. (2009). AKT1 gene polymorphisms and obstetric complications in the patients with schizophrenia. Psychiatry Investigation 6 102–107.
  • Kim, J. Y., Liu, C. Y., Zhang, F., Duan, X., Wen, Z., Song, J., Feighery, E., Lu, B., Rujescu, D., Clair, D. S., Christian, K., Callicott, J. H., Weinberger, D. R., Song, H. and Ming, G.-l. (2012). Interplay between DISC1 and GABA signaling regulates neurogenesis in mice and risk for schizophrenia. Cell 148 1051–1064.
  • LeCun, Y., Bengio, Y. and Hinton, G. (2015). Deep learning. Nature 521 436–444.
  • Lencz, T., Morgan, T. V., Athanasiou, M., Dain, B., Reed, C. R., Kane, J. M., Kucherlapati, R. and Malhotra, A. K. (2007). Converging evidence for a pseudoautosomal cytokine receptor gene locus in schizophrenia. Mol. Psychiatry 12 572–580.
  • Levitt, J. J., McCarley, R. W., Nestor, P. G., Petrescu, C., Donnino, R., Hirayasu, Y., Kikinis, R., Jolesz, F. A. and Shenton, M. E. (1999). Quantitative volumetric MRI study of the cerebellum and vermis in schizophrenia: Clinical and cognitive correlates. Am. J. Psychiatr. 156 1105–1107.
  • Li, F. and Zhang, N. R. (2010). Bayesian variable selection in structured high-dimensional covariate spaces with applications in genomics. J. Amer. Statist. Assoc. 105 1202–1214.
  • Li, F., Zhang, T., Wang, Q., Gonzalez, M. Z., Maresh, E. L. and Coan, J. (2015). Spatial Bayesian variable selection and grouping in high-dimensional scalar-on-image regressions. Ann. Appl. Stat. 9 687–713.
  • Lin, D., Calhoun, V. D. and Wang, Y.-P. (2014). Correspondence between fMRI and SNP data by group sparse canonical correlation analysis. Med. Image Anal. 18 891–902.
  • Lin, J.-A., Zhu, H., Mihye, A., Sun, W., Ibrahim, J. G. and for the Alzheimer’s Neuroimaging Initiative (2014). Functional-mixed effects models for candidate genetic mapping in imaging genetic studies. Genet. Epidemiol. 38 680–691.
  • Lindquist, M. A. (2008). The statistical analysis of fMRI data. Statist. Sci. 23 439–464.
  • Liu, J. and Calhoun, V. D. (2014). A review of multivariate analyses in imaging genetics. Front. Neuroinform. 8 29.
  • Liu, J., Pearlson, G., Windemuth, A., Ruano, G., Perrone-Bizzozero, N. I. and Calhoun, V. (2009). Combining fMRI and SNP data to investigate connections between brain function and genetics using parallel ICA. Hum. Brain Mapp. 30 241–255.
  • Lo, W.-S., Lau, C.-F., Xuan, Z., Chan, C.-F., Feng, G.-Y., He, L., Cao, Z.-C., Liu, H., Luan, Q.-M. and Xue, H. (2004). Association of SNPs and haplotypes in GABAA receptor beta2 gene with schizophrenia. Mol. Psychiatry 9 603–608.
  • Madigan, D. and Raftery, A. E. (1994). Model selection and accounting for model uncertainty in graphical models using Occam’s window. J. Amer. Statist. Assoc. 89 1535–1546.
  • Meda, S. A., Narayanan, B., Liu, J., Perrone-Bizzozero, N. I., Stevens, M. C., Calhoun, V. D., Glahn, D. C., Shen, L., Risacher, S. L., Saykin, A. J. and Pearlson, G. D. (2012). A large scale multivariate parallel ICA method reveals novel imaging–genetic relationships for Alzheimer’s disease in the ADNI cohort. NeuroImage 60 1608–1621.
  • Meyer-Lindenberg, A. (2012). The future of fMRI and genetics research. NeuroImage 62 1286–1292.
  • Müller, V. I., Cieslik, E. C., Laird, A. R., Fox, P. T. and Eickhoff, S. B. (2013). Dysregulated left inferior parietal activity in schizophrenia and depression: Functional connectivity and characterization. Front. Human Neurosci. 7 268.
  • Okugawa, G., Sedvall, G. C. and Agartz, I. (2003). Smaller cerebellar vermis but not hemisphere volumes in patients with chronic schizophrenia. Am. J. Psychiatr. 160 1614–1617.
  • Potkin, S. G., Turner, J. A., Fallon, J. A., Lakatos, A., Keator, D. B., Guffanti, G. and Macciardi, F. (2009). Gene discovery through imaging genetics: Identification of two novel genes associated with schizophrenia. Mol. Psychiatry 14 416–428.
  • Potkin, S. G., van Erp, T. G. M., Ling, S., Macciardi, F. and Xie, X. (2015). Identifying Unanticipated Genes and Mechanisms in Serious Mental Illness: GWAS Based Imaging Genetics Strategies. 209. Oxford Univ. Press, London.
  • Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge Univ. Press, Cambridge.
  • Saetre, P., Agartz, I., Franciscis, A. D., Lundmark, P., Djurovic, S., Kahler, A., Andreassen, O. A., Jakobsen, K. D., Rasmussen, H. B., Werge, T., Hall, H., Terenius, L. and Jonsson, E. G. (2008). Association between a disrupted-in-schizophrenia 1 (DISC1) single nucleotide polymorphism and schizophrenia in a combined Scandinavian case-control sample. Schizophr. Res. 106 237–241.
  • Scott, J. G. and Berger, J. O. (2010). Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. Ann. Statist. 38 2587–2619.
  • Sha, N., Vannucci, M., Tadesse, M. G., Brown, P. J., Dragoni, I., Davies, N., Roberts, T. C., Contestabile, A., Salmon, M., Buckley, C. and Falciani, F. (2004). Bayesian variable selection in multinomial probit models to identify molecular signatures of disease stage. Biometrics 60 812–828.
  • Shahbaba, B., Shachaf, C. M. and Yu, Z. (2012). A pathway analysis method for genome-wide association studies. Stat. Med. 31 988–1000.
  • Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2013). A sparse-group lasso. J. Comput. Graph. Statist. 22 231–245.
  • Sonnenburg, S., Rätsch, G., Schäfer, C. and Schölkopf, B. (2006). Large scale multiple kernel learning. J. Mach. Learn. Res. 7 1531–1565.
  • Stingo, F. C., Vannucci, M. and Downey, G. (2012). Bayesian wavelet-based curve classification via discriminant analysis with Markov random tree priors. Statist. Sinica 22 465–488.
  • Stingo, F. C., Chen, Y. A., Vannucci, M., Barrier, M. and Mirkes, P. E. (2010). A Bayesian graphical modeling approach to microRNA regulatory network inference. Ann. Appl. Stat. 4 2024–2048.
  • Stingo, F. C., Chen, Y. A., Tadesse, M. G. and Vannucci, M. (2011). Incorporating biological information into linear models: A Bayesian approach to the selection of pathways and genes. Ann. Appl. Stat. 5 1978–2002.
  • Stingo, F. C., Guindani, M., Vannucci, M. and Calhoun, V. D. (2013). An integrative Bayesian modeling approach to imaging genetics. J. Amer. Statist. Assoc. 108 876–891.
  • Swartz, M. D., Yu, R. K. and Shete, S. (2008). Finding factors influencing risk: Comparing Bayesian stochastic search and standard variable selection methods applied to logistic regression models of cases and controls. Stat. Med. 27 6158–6174.
  • Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., Mazoyer, B. and Joliot, M. (2002). Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 15 273–289.
  • Vounou, M., Nichols, T. E. and Montana, G. (2010). Discovering genetic associations with high-dimensional neuroimaging phenotypes: A sparse reduced-rank regression approach. NeuroImage 53 1147–1159.
  • Vounou, M., Janousova, E., Wolz, R., Stein, J. L., Thompson, P. M., Rueckert, D. and Montana, G. (2012). Sparse reduced-rank regression detects genetic associations with voxel-wise longitudinal phenotypes in Alzheimer’s disease. NeuroImage 60 700–716.
  • Waltz, J. A., Schweitzer, J. B., Gold, J. M., Kurup, P. K., Ross, T. J., Salmeron, B. J., Rose, E. J., McClure, S. M. and Stein, E. A. (2009). Patients with schizophrenia have a reduced neural response to both unpredictable and predictable primary reinforcers. Neuropsychopharmacology 34 1567–1577.
  • Wang, H., Nie, F., Huang, H., Risacher, S. L., Saykin, A. J., Shen, L. and the Alzheimer’s Disease Neuroimaging Initiative (2012a). Identifying disease sensitive and quantitative trait-relevant biomarkers from multidimensional heterogeneous imaging genetics data via sparse multimodal multitask learning. Bioinformatics 28 i127–i136.
  • Wang, H., Nie, F., Huang, H., Kim, S., Nho, K., Risacher, S. L., Saykin, A. J., Shen, L. and the Alzheimer’s Disease Neuroimaging Initiative (2012b). Identifying quantitative trait loci via group-sparse multitask regression and feature selection: An imaging genetics study of the ADNI cohort. Bioinformatics 28 229–237.
  • Weiss, K. M. (1989). Advantages of abandoning symptom-based diagnostic systems of research in schizophrenia. Am. J. Orthopsychiatr. 59 324–330.
  • Wu, L., Calhoun, V. D., Jung, R. E. and Caprihan, A. (2015). Connectivity-based whole brain dual parcellation by group ICA reveals tract structures and decreased connectivity in schizophrenia. Hum. Brain Mapp. 36 4681–4701.
  • Xu, M.-Q., Xing, Q.-H., Zheng, Y.-L., Li, S., Gao, J.-J., He, G., Guo, T.-W., Feng, G.-Y., Xu, F. and He, L. (2007). Association of AKT1 gene polymorphisms with risk of schizophrenia and with response to antipsychotics in the Chinese population. J. Clin. Psychiatry 68 1358–1367.
  • Yang, H., Liu, J., Sui, J., Pearlson, G. and Calhoun, V. D. (2010). A hybrid machine learning method for fusing fMRI and genetic data: Combining both improves classification of schizophrenia. Front. Human Neurosci. 4 1–9.
  • Yu, Z., Chen, J., Shi, H., Stoeber, G., Tsang, S.-Y. and Xue, H. (2006). Analysis of GABRB2 association with schizophrenia in German population with DNA sequencing and one-label extension method for SNP genotyping. Clin. Biochem. 39 210–218.
  • Zhang, L., Guindani, M. and Vannucci, M. (2015). Bayesian models for functional magnetic resonance imaging data analysis. Wiley Interdiscip. Rev.: Comput. Stat. 7 21–41.
  • Zhang, Z., Huang, H. and Shen, D. (2014). Integrative analysis of multi-dimensional imaging genomics data for Alzheimer’s disease prediction. Front. Aging Neurosci. 6 1–9.
  • Zhang, T., Wiesel, A. and Greco, M. S. (2013). Multivariate generalized Gaussian distribution: Convexity and graphical models. IEEE Trans. Signal Process. 61 4141–4148.
  • Zhang, H. H., Ahn, J., Lin, X. and Park, C. (2006). Gene selection using support vector machines with non-convex penalty. Bioinformatics 22 88–95.
  • Zhu, H., Khondker, Z., Lu, Z. and Ibrahim, J. G. (2014). Bayesian generalized low rank regression models for neuroimaging phenotypes and genetic markers. J. Amer. Statist. Assoc. 109 977–990.
  • Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 301–320.

Supplemental materials

  • Supplement to “A Bayesian predictive model for imaging genetics with application to schizophrenia”. The supplementary material [Chekouo et al. (2016)] contains details about posterior computation, hyperparameter settings and sensitivity, data preprocessing, and additional simulation studies and data analyses. The companion MATLAB code is available on The Annals of Applied Statistics website.