The Annals of Applied Statistics

A statistical analysis of memory CD8 T cell differentiation: An application of a hierarchical state space model to a short time course microarray experiment

Haiyan Wu, Ming Yuan, Susan M. Kaech, and M. Elizabeth Halloran

CD8 T cells are specialized immune cells that play an important role in regulating the antiviral immune response and generating protective immunity. In this paper we investigate the differentiation of memory CD8 T cells during the immune response using a short time course microarray experiment. Structurally, this experiment resembles many others: measurements are taken on independent samples from a single biological group at a small number of irregularly spaced time points, and they exhibit temporal nonstationarity. To analyze this CD8 T-cell experiment, we develop a hierarchical state space model that lets us (1) detect temporally differentially expressed genes, (2) identify the direction of successive changes over time, and (3) assess the magnitude of successive changes over time. We incorporate hidden Markov models to use the information embedded in the time series, and we set up the proposed hierarchical state space model in an empirical Bayes framework to use the population information in the large-scale data. Analysis of the CD8 T-cell experiment with the proposed model yields biologically meaningful findings. Temporal patterns involved in the differentiation of memory CD8 T cells are summarized separately, and the performance of the proposed model is illustrated in a simulation study.
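
To make the idea of identifying the direction of successive changes more concrete, the following is a minimal sketch of a three-state hidden Markov model applied to one gene's successive expression differences. It is not the hierarchical state space model developed in the paper: the state labels, the Gaussian emission means and standard deviation, the transition probabilities, and the function names are all hypothetical choices made for illustration, and no empirical Bayes estimation across genes is attempted.

```python
import math

# Hypothetical hidden states for the change between consecutive time points.
STATES = ("down", "flat", "up")


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_directions(diffs, means=(-1.0, 0.0, 1.0), sd=0.5, stay=0.6):
    """Forward-backward posterior P(state_t | all diffs) for one gene.

    diffs: successive differences in (log) expression, one per interval.
    means, sd, stay: assumed emission and transition parameters (illustrative).
    """
    k = len(STATES)
    init = [1.0 / k] * k
    move = (1.0 - stay) / (k - 1)
    trans = [[stay if i == j else move for j in range(k)] for i in range(k)]
    emis = [[normal_pdf(d, means[s], sd) for s in range(k)] for d in diffs]

    # Forward pass, rescaled at each step to avoid underflow.
    alpha, prev = [], None
    for t, e in enumerate(emis):
        if t == 0:
            a = [init[s] * e[s] for s in range(k)]
        else:
            a = [e[s] * sum(prev[r] * trans[r][s] for r in range(k))
                 for s in range(k)]
        total = sum(a)
        prev = [v / total for v in a]
        alpha.append(prev)

    # Backward pass, also rescaled at each step.
    beta = [[1.0] * k for _ in emis]
    for t in range(len(emis) - 2, -1, -1):
        b = [sum(trans[s][r] * emis[t + 1][r] * beta[t + 1][r]
                 for r in range(k)) for s in range(k)]
        total = sum(b)
        beta[t] = [v / total for v in b]

    # Combine and normalize to get the posterior state probabilities.
    post = []
    for a, b in zip(alpha, beta):
        w = [x * y for x, y in zip(a, b)]
        total = sum(w)
        post.append([v / total for v in w])
    return post


def call_directions(diffs, **kwargs):
    """Most probable direction for each interval, taken marginally."""
    post = posterior_directions(diffs, **kwargs)
    return [STATES[max(range(len(STATES)), key=row.__getitem__)]
            for row in post]
```

Calling `call_directions` on a short list of differences returns one label per interval. Because the posterior at each interval depends on the whole series, a borderline difference can be pulled toward the direction of its neighbors, which is the sense in which a hidden Markov component uses the information embedded in the time series.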

Article information

Ann. Appl. Stat., Volume 1, Number 2 (2007), 442-458.

First available in Project Euclid: 30 November 2007

Keywords: Hidden Markov model, microarrays, gene expression profiles, time course, empirical Bayes


Wu, Haiyan; Yuan, Ming; Kaech, Susan M.; Halloran, M. Elizabeth. A statistical analysis of memory CD8 T cell differentiation: An application of a hierarchical state space model to a short time course microarray experiment. Ann. Appl. Stat. 1 (2007), no. 2, 442–458. doi:10.1214/07-AOAS118.

