The Annals of Applied Statistics

Nonparametric multi-level clustering of human epilepsy seizures

Drausin F. Wulsin, Shane T. Jensen, and Brian Litt

Understanding neuronal activity in the human brain is an extremely difficult problem both in terms of measurement and statistical modeling. We address a particular research question in this area: the analysis of human intracranial electroencephalogram (iEEG) recordings of epileptic seizures from a collection of patients. In these data, each seizure of each patient is defined by the activities of many individual recording channels. The modeling of epileptic seizures is challenging due to the large amount of heterogeneity in iEEG signal between channels within a particular seizure, between seizures within an individual, and across individuals. We develop a new nonparametric hierarchical Bayesian model that simultaneously addresses these multiple levels of heterogeneity in our epilepsy data. Our approach, which we call a multi-level clustering hierarchical Dirichlet process (MLC-HDP), clusters over channel activities within a seizure, over seizures of a patient, and over patients. We demonstrate the advantages of our methodology over alternative approaches in human EEG seizure data and show that its seizure clustering is close to manual clustering by a physician expert. We also address important clinical questions such as “to which seizures of other patients is this seizure similar?”

Ann. Appl. Stat., Volume 10, Number 2 (2016), 667-689.

Received: February 2013
Revised: January 2015
First available in Project Euclid: 22 July 2016

Epilepsy; seizures; intracranial electroencephalogram (iEEG); Dirichlet process; nonparametric Bayes; clustering


Wulsin, Drausin F.; Jensen, Shane T.; Litt, Brian. Nonparametric multi-level clustering of human epilepsy seizures. Ann. Appl. Stat. 10 (2016), no. 2, 667-689. doi:10.1214/15-AOAS851.

Supplemental materials

  • Supplement to “Nonparametric multi-level clustering of human epilepsy seizures”. We present visual summaries of the seizures for each of our patients. We provide details of our Normal model and conjugate prior. We explore the sensitivity of our results to different priors for the concentration parameters as well as the influence of individual patients. We outline detailed algorithms of our MCMC model implementation as well as a comparison to alternative sampling schemes. We give mathematical expressions for the six Schiff features mentioned in our results section. We provide a synthetic data comparison to the NDP model.