The Annals of Applied Statistics

SCALPEL: Extracting neurons from calcium imaging data

Ashley Petersen, Noah Simon, and Daniela Witten

In the past few years, new technologies in the field of neuroscience have made it possible to simultaneously image activity in large populations of neurons at cellular resolution in behaving animals. In mid-2016, a huge repository of this so-called “calcium imaging” data was made publicly available. The availability of this large-scale data resource opens the door to a host of scientific questions for which new statistical methods must be developed.

In this paper we consider the first step in the analysis of calcium imaging data—namely, identifying the neurons in a calcium imaging video. We propose a dictionary learning approach for this task. First, we perform image segmentation to develop a dictionary containing a huge number of candidate neurons. Next, we refine the dictionary using clustering. Finally, we apply the dictionary to select neurons and estimate their corresponding activity over time, using a sparse group lasso optimization problem. We assess performance on simulated calcium imaging data and apply our proposal to three calcium imaging data sets.

Our proposed approach is implemented in the R package scalpel, which is available on CRAN.
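
As a rough illustration of the kind of penalty used in the final step, the following Python sketch implements the proximal operator of a generic sparse group lasso penalty, lam * (alpha * ||b||_1 + (1 - alpha) * sum over groups of ||b_g||_2). It is not taken from the scalpel package: the function names, the grouping, the unweighted group norms, and the parameters lam and alpha are assumptions made here for illustration, and the optimization problem actually solved by SCALPEL is specified in the full text.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t (the lasso proximal step)."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def prox_sparse_group_lasso(groups, lam, alpha):
    """Proximal operator of lam * (alpha * ||b||_1 + (1 - alpha) * sum_g ||b_g||_2).

    groups: list of lists of floats, one inner list per (hypothetical) group.
    lam:    overall penalty strength (assumed name).
    alpha:  mix between lasso (alpha = 1) and group lasso (alpha = 0).
    """
    out = []
    for g in groups:
        # Step 1: elementwise soft-thresholding handles the lasso part.
        s = [soft_threshold(v, alpha * lam) for v in g]
        # Step 2: shrink the whole group toward zero for the group lasso part.
        norm = math.sqrt(sum(v * v for v in s))
        t = (1.0 - alpha) * lam
        scale = max(1.0 - t / norm, 0.0) if norm > 0 else 0.0
        out.append([scale * v for v in s])
    return out


if __name__ == "__main__":
    coefs = [[3.0, 4.0], [0.1, -0.2]]
    print(prox_sparse_group_lasso(coefs, lam=1.0, alpha=0.5))
```

With alpha = 1 the operator reduces to elementwise soft-thresholding (lasso), and with alpha = 0 it shrinks each group as a whole (group lasso). Intermediate values can zero out both individual coefficients and entire groups, which is what makes such a penalty suited to selecting a subset of candidates while keeping their estimated activity sparse.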

Article information

Ann. Appl. Stat., Volume 12, Number 4 (2018), 2430-2456.

Received: March 2017
Revised: December 2017
First available in Project Euclid: 13 November 2018

Digital Object Identifier
doi:10.1214/18-AOAS1159

Keywords

Calcium imaging; cell sorting; dictionary learning; neuron identification; segmentation; clustering; sparse group lasso


Petersen, Ashley; Simon, Noah; Witten, Daniela. SCALPEL: Extracting neurons from calcium imaging data. Ann. Appl. Stat. 12 (2018), no. 4, 2430--2456. doi:10.1214/18-AOAS1159.

Supplemental materials

  • Supplementary Materials for “SCALPEL: Extracting neurons from calcium imaging data”. We provide additional results including the technical details of SCALPEL’s Step 3 and analyses illustrating the sensitivity of results to changes in SCALPEL’s tuning parameters.