The Annals of Applied Statistics

A Markov random field-based approach to characterizing human brain development using spatial–temporal transcriptome data

Zhixiang Lin, Stephan J. Sanders, Mingfeng Li, Nenad Sestan, Matthew W. State, and Hongyu Zhao

Full-text: Open access

Abstract

Human neurodevelopment is a highly regulated biological process. In this article, we study the dynamic changes of neurodevelopment through the analysis of human brain microarray data, sampled from 16 brain regions in 15 time periods of neurodevelopment. We develop a two-step inferential procedure to identify expressed and unexpressed genes and to detect differentially expressed genes between adjacent time periods. Markov Random Field (MRF) models are used to efficiently utilize the information embedded in brain region similarity and temporal dependency in our approach. We develop and implement a Monte Carlo expectation–maximization (MCEM) algorithm to estimate the model parameters. Simulation studies suggest that our approach achieves lower misclassification error and potential gain in power compared with models not incorporating spatial similarity and temporal dependency.
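
As a rough illustration of how a Markov random field can borrow information across brain regions and adjacent time periods when labelling a gene as expressed or unexpressed, the sketch below uses iterated conditional modes (ICM; Besag, 1986) on a small region-by-period grid. It is not the authors' model or their MCEM algorithm. The function name, the two-component Gaussian emission with fixed means `mu` and standard deviation `sigma`, the single interaction weight `beta`, and the neighbourhood definition (adjacent periods in the same region, plus all other regions in the same period) are assumptions made for the example, and the parameters are held fixed rather than estimated.

```python
def icm_binary_mrf(y, mu=(0.0, 2.0), sigma=1.0, beta=1.0, n_sweeps=10):
    """Label each cell of y[region][period] as 0 (unexpressed) or 1 (expressed).

    Hypothetical sketch: Gaussian emissions with known means, plus an
    Ising-type prior that rewards agreement with neighbouring labels.
    """
    n_regions, n_periods = len(y), len(y[0])

    def log_lik(value, k):
        # Gaussian log-likelihood up to an additive constant.
        return -0.5 * ((value - mu[k]) / sigma) ** 2

    # Initialise from the likelihood alone (no spatial or temporal smoothing).
    z = [[int(log_lik(v, 1) > log_lik(v, 0)) for v in row] for row in y]

    for _ in range(n_sweeps):
        changed = False
        for r in range(n_regions):
            for t in range(n_periods):
                # Temporal neighbours: adjacent periods in the same region.
                nbrs = [z[r][u] for u in (t - 1, t + 1) if 0 <= u < n_periods]
                # Spatial neighbours (assumption): other regions, same period.
                nbrs += [z[s][t] for s in range(n_regions) if s != r]
                scores = []
                for k in (0, 1):
                    agree = sum(1 for n in nbrs if n == k)
                    scores.append(log_lik(y[r][t], k) + beta * agree)
                new_label = int(scores[1] > scores[0])
                if new_label != z[r][t]:
                    z[r][t] = new_label
                    changed = True
        if not changed:
            break  # reached a local mode of the conditional posterior
    return z


# Toy data: 3 regions x 4 periods; one borderline value (0.9) in region 0.
y = [[2.1, 1.9, 0.9, 2.2],
     [1.8, 2.3, 2.0, 1.7],
     [2.0, 2.1, 1.9, 2.4]]
independent = icm_binary_mrf(y, beta=0.0)  # no neighbour information
smoothed = icm_binary_mrf(y, beta=1.0)     # neighbours pull the label up
```

With `beta=0.0` the borderline observation is classified from its own value only; with a positive `beta` its neighbours, all labelled as expressed, outweigh the small likelihood difference. This is the kind of information sharing the abstract refers to, shown here in a much simpler setting.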

Article information

Source
Ann. Appl. Stat., Volume 9, Number 1 (2015), 429-451.

Dates
First available in Project Euclid: 28 April 2015

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1430226099

Digital Object Identifier
doi:10.1214/14-AOAS802

Mathematical Reviews number (MathSciNet)
MR3341122

Zentralblatt MATH identifier
06446575

Keywords
Markov Random Field model; spatial and temporal data; neurodevelopment; microarray; Monte Carlo expectation–maximization algorithm; gene expression; differential expression

Citation

Lin, Zhixiang; Sanders, Stephan J.; Li, Mingfeng; Sestan, Nenad; State, Matthew W.; Zhao, Hongyu. A Markov random field-based approach to characterizing human brain development using spatial–temporal transcriptome data. Ann. Appl. Stat. 9 (2015), no. 1, 429–451. doi:10.1214/14-AOAS802. https://projecteuclid.org/euclid.aoas/1430226099


References

  • Amaral, D. G., Schumann, C. M. and Nordahl, C. W. (2008). Neuroanatomy of autism. Trends Neurosci. 31 137–145.
  • American Psychiatric Association (2000). Diagnostic and Statistical Manual of Mental Disorders: DSM-IV-TR®. American Psychiatric Publishing, Arlington, VA.
  • Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. J. Roy. Statist. Soc. Ser. B 36 192–236.
  • Besag, J. (1986). On the statistical analysis of dirty pictures. J. Roy. Statist. Soc. Ser. B 48 259–302.
  • Chen, M., Cho, J. and Zhao, H. (2011). Incorporating biological pathways via a Markov random field model in genome-wide association studies. PLoS Genet. 7 e1001353.
  • Efron, B. (2004). Large-scale simultaneous hypothesis testing. J. Amer. Statist. Assoc. 99 96–104.
  • Geschwind, D. H. and Levitt, P. (2007). Autism spectrum disorders: Developmental disconnection syndromes. Curr. Opin. Neurobiol. 17 103–111.
  • Hong, F. and Li, H. (2006). Functional hierarchical models for identifying genes with different time-course expression profiles. Biometrics 62 534–544.
  • Huang, D., Sherman, B. T., Lempicki, R. A. et al. (2008). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4 44–57.
  • Iossifov, I., Ronemus, M., Levy, D., Wang, Z., Hakker, I., Rosenbaum, J., Yamrom, B., Lee, Y.-h., Narzisi, G., Leotta, A. et al. (2012). De novo gene disruptions in children on the autistic spectrum. Neuron 74 285–299.
  • Johnson, M. B., Kawasawa, Y. I., Mason, C. E., Krsnik, Ž., Coppola, G., Bogdanović, D., Geschwind, D. H., Mane, S. M., Sestan, N. et al. (2009). Functional and evolutionary insights into human brain development through global transcriptome analysis. Neuron 62 494–509.
  • Kang, H. J., Kawasawa, Y. I., Cheng, F., Zhu, Y., Xu, X., Li, M., Sousa, A. M., Pletikos, M., Meyer, K. A., Sedmak, G. et al. (2011). Spatio-temporal transcriptome of the human brain. Nature 478 483–489.
  • Kong, A., Frigge, M. L., Masson, G., Besenbacher, S., Sulem, P., Magnusson, G., Gudjonsson, S. A., Sigurdsson, A., Jonasdottir, A., Jonasdottir, A. et al. (2012). Rate of de novo mutations and the importance of father’s age to disease risk. Nature 488 471–475.
  • Levine, R. A. and Casella, G. (2001). Implementations of the Monte Carlo EM algorithm. J. Comput. Graph. Statist. 10 422–439.
  • Li, C., Wei, Z. and Li, H. (2010). Network-based empirical Bayes methods for linear models with applications to genomic data. J. Biopharm. Statist. 20 209–222.
  • Li, H., Wei, Z. and Maris, J. (2010). A hidden Markov random field model for genome-wide association studies. Biostatistics 11 139–150.
  • Lin, Z., Sanders, S. J., Li, M., Sestan, N., State, M. W. and Zhao, H. (2015). Supplement to “A Markov random field-based approach to characterizing human brain development using spatial–temporal transcriptome data.” DOI:10.1214/14-AOAS802SUPP.
  • Liu, X. and Yang, M. C. (2009). Identifying temporally differentially expressed genes through functional principal components analysis. Biostatistics 10 667–679.
  • Neale, B. M., Kou, Y., Liu, L., Ma’ayan, A., Samocha, K. E., Sabo, A., Lin, C.-F., Stevens, C., Wang, L.-S., Makarov, V. et al. (2012). Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485 242–245.
  • Newton, M. A., Kendziorski, C. M., Richmond, C. S., Blattner, F. R. and Tsui, K.-W. (2001). On differential variability of expression ratios: Improving statistical inference about gene expression changes from microarray data. J. Comput. Biol. 8 37–52.
  • O’Roak, B. J., Deriziotis, P., Lee, C., Vives, L., Schwartz, J. J., Girirajan, S., Karakoc, E., MacKenzie, A. P., Ng, S. B., Baker, C. et al. (2011). Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nat. Genet. 43 585–589.
  • O’Roak, B. J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B. P., Levy, R., Ko, A., Lee, C., Smith, J. D. et al. (2012). Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485 246–250.
  • Sanders, S. J., Murtha, M. T., Gupta, A. R., Murdoch, J. D., Raubeson, M. J., Willsey, A. J., Ercan-Sencicek, A. G., DiLullo, N. M., Parikshak, N. N., Stein, J. L. et al. (2012). De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485 237–241.
  • Sestan, N. et al. (2012). The emerging biology of autism spectrum disorders. Science (New York, NY) 337 1301.
  • Sherman, B. T., Lempicki, R. A. et al. (2009). Bioinformatics enrichment tools: Paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37 1–13.
  • Storey, J. D., Xiao, W., Leek, J. T., Tompkins, R. G. and Davis, R. W. (2005). Significance analysis of time course microarray experiments. Proc. Natl. Acad. Sci. USA 102 12837–12842.
  • Tai, Y. C. and Speed, T. P. (2006). A multivariate empirical Bayes statistic for replicated microarray time course data. Ann. Statist. 34 2387–2412.
  • Voineagu, I., Wang, X., Johnston, P., Lowe, J. K., Tian, Y., Horvath, S., Mill, J., Cantor, R. M., Blencowe, B. J. and Geschwind, D. H. (2011). Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature 474 380–384.
  • Walsh, C. A., Morrow, E. M. and Rubenstein, J. L. (2008). Autism and brain development. Cell 135 396–400.
  • Wei, Z. and Li, H. (2007). A Markov random field model for network-based analysis of genomic data. Bioinformatics 23 1537–1544.
  • Wei, Z. and Li, H. (2008). A hidden spatial–temporal Markov random field model for network-based analysis of time course gene expression data. Ann. Appl. Stat. 2 408–429.
  • Wei, G. C. and Tanner, M. A. (1990). A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms. J. Amer. Statist. Assoc. 85 699–704.
  • Willsey, A. J., Sanders, S. J., Li, M., Dong, S., Tebbenkamp, A. T., Muhle, R. A., Reilly, S. K., Lin, L., Fertuzinhos, S., Miller, J. A. et al. (2013). Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism. Cell 155 997–1007.
  • Wu, H., Yuan, M., Kaech, S. M. and Halloran, M. E. (2007). A statistical analysis of memory CD8 T cell differentiation: An application of a hierarchical state space model to a short time course microarray experiment. Ann. Appl. Stat. 1 442–458.
  • Yuan, M. and Kendziorski, C. (2006). Hidden Markov models for microarray time course data in multiple biological conditions. J. Amer. Statist. Assoc. 101 1323–1332.

Supplemental materials

  • Supplement to "A Markov random field-based approach to characterizing human brain development using spatial–temporal transcriptome data." Section 1: More information on the brain regions. Section 2: Spatial and temporal similarity. Section 3: Microarray quality control procedures. Section 4: Model fit and the robustness of the Gaussian mixture model. Section 5: Diagnosis for the MCEM algorithm. Section 6: Gene Ontology (GO) enrichment analysis. Section 7: High confidence ASD genes. Section 8: Supplementary data for Section 4.1. Section 9: Supplementary data for Section 4.2. Section 10: Comparison between the ICM algorithm and the MCEM algorithm.