The Annals of Applied Statistics

Nonparametric multi-level clustering of human epilepsy seizures

Drausin F. Wulsin, Shane T. Jensen, and Brian Litt


Understanding neuronal activity in the human brain is an extremely difficult problem in terms of both measurement and statistical modeling. We address a particular research question in this area: the analysis of human intracranial electroencephalogram (iEEG) recordings of epileptic seizures from a collection of patients. In these data, each seizure of each patient is defined by the activities of many individual recording channels. The modeling of epileptic seizures is challenging due to the large amount of heterogeneity in the iEEG signal between channels within a particular seizure, between seizures within an individual, and across individuals. We develop a new nonparametric hierarchical Bayesian model that simultaneously addresses these multiple levels of heterogeneity in our epilepsy data. Our approach, which we call a multi-level clustering hierarchical Dirichlet process (MLC-HDP), clusters over channel activities within a seizure, over seizures of a patient, and over patients. We demonstrate the advantages of our methodology over alternative approaches on human EEG seizure data and show that its seizure clustering is close to manual clustering by a physician expert. We also address important clinical questions such as “to which seizures of other patients is this seizure similar?”
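As a rough illustration of the building block behind Dirichlet process clustering, the sketch below draws cluster labels from a Chinese restaurant process, the partition distribution induced by a Dirichlet process prior. It is not the authors' MLC-HDP and omits the data likelihood and the multi-level structure entirely; the function name `crp_assignments`, the parameter `alpha`, and the values used are assumptions chosen for illustration only. The point it shows is that the number of clusters is not fixed in advance but grows with the data.

```python
import random
from collections import Counter


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha (hypothetical name) is the concentration parameter and must be
    positive; larger values tend to produce more clusters.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = crp_assignments(n_items=50, alpha=1.0)
    print("clusters used:", len(set(labels)))
    print("cluster sizes:", sorted(Counter(labels).values(), reverse=True))
```

In a hierarchical model of the kind the abstract describes, a prior of this sort would presumably be applied at each level (channels, seizures, patients) and combined with a likelihood for the observed features; the sketch only samples from the prior.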

Article information

Ann. Appl. Stat., Volume 10, Number 2 (2016), 667-689.

Received: February 2013
Revised: January 2015
First available in Project Euclid: 22 July 2016

Keywords

Epilepsy; seizures; intracranial electroencephalogram (iEEG); Dirichlet process; nonparametric Bayes; clustering


Wulsin, Drausin F.; Jensen, Shane T.; Litt, Brian. Nonparametric multi-level clustering of human epilepsy seizures. Ann. Appl. Stat. 10 (2016), no. 2, 667–689. doi:10.1214/15-AOAS851.


Supplemental materials

  • Supplement to “Nonparametric multi-level clustering of human epilepsy seizures”. We present visual summaries of the seizures for each of our patients. We provide details of our Normal model and conjugate prior. We explore the sensitivity of our results to different priors for the concentration parameters as well as the influence of individual patients. We outline detailed algorithms for our MCMC model implementation and compare them to alternative sampling schemes. We give mathematical expressions for the six Schiff features mentioned in our results section. We provide a synthetic data comparison to the NDP model.