The Annals of Applied Statistics

Bayesian detection of embryonic gene expression onset in C. elegans

Jie Hu, Zhongying Zhao, Hari Krishna Yalamanchili, Junwen Wang, Kenny Ye, and Xiaodan Fan

Full-text: Open access


To study how a zygote develops into an embryo with different tissues, experimental biologists have recently produced large-scale 4D confocal movies of C. elegans embryos. However, the lack of principled statistical methods for these highly noisy data has hindered their comprehensive analysis. We introduce a probabilistic change point model on the cell lineage tree to estimate the onset time of embryonic gene expression. A Bayesian approach is used to fit the model to the 4D confocal movie data. Classification methods are then used to choose a model selection threshold and to refine the expression onset time from the branch level to the level of a specific cell and time point. Extensive simulations show that the method is highly accurate. Applied to real data, it recovers previously known results and yields new findings.
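As a rough illustration of what Bayesian change point detection of an expression onset can look like, the sketch below computes a posterior over the onset index for a single noisy intensity series. It is a deliberately simplified stand-in and not the model of the paper: it ignores the cell lineage tree, and it assumes Gaussian noise with known pre-onset mean, post-onset mean and standard deviation, together with a uniform prior over the onset position. The function name `onset_posterior` and all parameter names are hypothetical.

```python
import numpy as np


def onset_posterior(y, mu0=0.0, mu1=1.0, sigma=1.0):
    """Posterior over the onset index tau for one intensity series.

    Assumed toy model (not the paper's): y[:tau] ~ N(mu0, sigma^2) and
    y[tau:] ~ N(mu1, sigma^2), with mu0, mu1, sigma known and a uniform
    prior on tau in {0, ..., T}. tau == T means "no onset observed".
    """
    y = np.asarray(y, dtype=float)
    # Per-point Gaussian log-likelihoods under each regime; the shared
    # normalising constant cancels when the posterior is normalised.
    ll_pre = -0.5 * ((y - mu0) / sigma) ** 2
    ll_post = -0.5 * ((y - mu1) / sigma) ** 2
    # pre[tau] = sum of ll_pre[:tau]; post[tau] = sum of ll_post[tau:].
    pre = np.concatenate(([0.0], np.cumsum(ll_pre)))
    post = np.concatenate((np.cumsum(ll_post[::-1])[::-1], [0.0]))
    log_post = pre + post
    log_post -= log_post.max()  # stabilise before exponentiating
    weights = np.exp(log_post)
    return weights / weights.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic series: 30 background points, then 20 expressing points.
    y = np.concatenate((rng.normal(0.0, 1.0, 30), rng.normal(3.0, 1.0, 20)))
    posterior = onset_posterior(y, mu0=0.0, mu1=3.0, sigma=1.0)
    print("most probable onset index:", int(np.argmax(posterior)))
```

Working in log space and subtracting the maximum before exponentiating keeps the normalisation numerically stable for long series. In a fuller treatment the regime means and noise level would receive priors and be integrated out or sampled rather than fixed.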

Article information

Ann. Appl. Stat., Volume 9, Number 2 (2015), 950-968.

Received: September 2014
Revised: January 2015
First available in Project Euclid: 20 July 2015

Keywords: 4D confocal microscopy; embryonic onset; change point detection; Bayesian method


Hu, Jie; Zhao, Zhongying; Yalamanchili, Hari Krishna; Wang, Junwen; Ye, Kenny; Fan, Xiaodan. Bayesian detection of embryonic gene expression onset in C. elegans. Ann. Appl. Stat. 9 (2015), no. 2, 950-968. doi:10.1214/15-AOAS820.

