Introduction to neural spike train data for phase-amplitude analysis

: Statistical analysis of spike trains is one of the central problems in neural coding, and can be pursued in several ways. One option is model-based, i.e. assume a parametric or semi-parametric model, such as the Poisson model, for spike train data and use it in decoding spike trains. The other option is metric-based, i.e. choose a metric for comparing the numbers and the placements of spikes in diﬀerent trains, and does not need a model. A prominent idea in the latter approach is to derive metrics that are based on measurements of time-warpings of spike trains needed in the alignments of corresponding spikes. We propose the use of ideas developed in functional data analysis, namely the deﬁnition and separation of phase-amplitude components, as a novel tool for analyzing spike trains and decoding underlying neural signals. For concreteness, we introduce a real spike train dataset taken from experimental recordings of the primary motor cortex of a monkey while performing certain arm movements. To facilitate functional data analysis, one needs to smooth the observed discrete spike trains with Gaussian kernels.


Introduction
In the nervous system, the neurons propagate time-dependent information over large distances via sequences of characteristic electrical pulses called action potentials or, more simply, spikes [4].These sequences (or spike trains) have been W. Wu et al. commonly regarded as the language of the brain and are the focus of much investigation.The decoding of these signals to extract underlying information is an important scientific problem of our times.Scientists across many disciplines are working on developing methods that analyze spike train data and extract information regarding limb movement, visual processing, and brain diagnostics.Due to significant variability in neural firing activity, even for same tasks and stimuli, a precise mathematical representation of spike trains is difficult and statistical modeling has naturally gained prominence [3,8].There is a large ongoing effort in computational statistics techniques for analyzing neuroscience data where one attempts to characterize the temporal structure of spike trains using parametric or semi-parametric stochastic processes such as Gaussian processes, Poisson processes, or general point process [2,7].The latest ideas in this area of research can be found in a biennial international workshop titled Statistical Analysis of Neuronal Data or SAND 1 .These statistical approaches have built solid foundations for neural data representation and have led to a variety of tools for statistical analysis of spike trains.
Alternatively, there have been several efforts in developing metric-based ideas for analyzing spike trains that rely on measurement of distances or dissimilarities between spike trains.These ideas do not require a precise statistical model for spike train data upfront.Examples include the distances in discrete state, discrete time models [9,11], discrete state, continuous time models [14,1,13,5], and those in continuous state, continuous time models [12,6].These methods have mainly focused on the clustering or classification of spike trains with respect to different stimuli, and are applied for neural decoding in various sensory and motor systems.
Focusing on metric-based approaches, the most widely used metric in computational neuroscience is the Victor-Purpura (VP) metric [14].This method measures both temporal and rate differences between spike trains.That is, the distance is based on the differences in the numbers of spikes in the two trains as well as their relative time locations.The differences in spike locations are measured using a warping function that optimally aligns the two given trains or maps the spike times of one train to those of the other.When seen from a functional data analysis perspective [10], it is the registration function between two spike trains.In this article we motivate the use of function registration techniques in spike train data analysis.This study is aided by a real spike train recording from neurons in primate motor cortex during four different movement behaviors [15].Our goal is to register, or align, the temporal structure of spike train data using time-warping functions.These functions can then be used to quantify the temporal variability in the data and perform neural decoding with respect to different behaviors.Note that registration methods are generally developed for continuous functions, rather than point processes such as spike trains.To meet this requirement, we will first smooth the recorded spike data using a Gaussian kernel function.

Neural spike train data
In this section we introduce a spike train dataset from a real experimental recording collected in the Hatsopoulos Lab and previously used in a metric-based study of spike trains [15,16].This dataset is then used to motivate the use of the function registration methods for spike train alignment, metric computation, and neural decoding.

Raw experimental recording
In this experiment the process of recording electrical signals generated at the cellular level is as follows.A silicon microelectrode array containing 100 platinizedtip electrodes was implanted in the arm area of primary motor cortex in a juvenile female macaque monkey (Macaca mulatta).The electrodes recorded electrical signals during different activities and signals were filtered, amplified and recorded digitally using a Cerebus acquisition system (developed by Cyberkinetics Neurotechnology Systems Inc.).Single units were manually extracted using an offline sorter developed by Plexon Inc.
In this setting a subject monkey was trained to perform a closed Squared-Path (SP) task by moving a cursor to targets via contralateral arm movements in the horizontal plane.Basically, the targets in the SP task are all fixed at the four corners of a square and the movement is stereotyped (Fig. 1A).At the start of a trial, the target appeared at one corner and the monkey had to move the cursor to reach it.Once reached, the target jumped counterclockwise to the next corner and the monkey had to move the cursor again.The trial ended when the cursor reached the starting corner.That is, in each trial the subject W. Wu et al. touched a sequence of 5 targets which were the four corners of the square, with the first and the last being the same.Each sequence of 5 targets defined a path, and there were four different paths in the SP task (depending on the starting vertex).In this experiment, we recorded 60 trials for each path and, thus, the total number of trials was 240.
To generate function data for phase-amplitude separation, we selectively choose spiking activity from one neuron which shows a significant tuning property over four different paths.The time length of each trial varies from 5 to 6 seconds.To fix a standardized time interval for all data, we normalize the kinematics and spiking activity in each trial to 5 seconds.For illustration, 10 spike trains from each movement path as well as the reaching times at the five corners are shown in Fig. 1B.It is observed that the number of spikes in each trial is about constant, but their temporal distribution is strongly based on the behavioral movement path; the temporal patterns are similar within the same path, but significantly different across difference ones.

Smoothed spike trains
So far the spike trains are discrete in the sense that the signal is exactly zero in between random spike times.For the purpose of alignment and phase-amplitude separation, a functional data with peaks and valleys is deemed better.To go from discrete spike data to continuous functional form, a standard approach is to smooth them using a continuous kernel.Here we adopt a commonly-used Gaussian kernel and the smoothing is simply a convolution operation.
Let s(t) be a spike train with spike times 0 < t 1 < t 2 < • • • < t M < T , where [0, T ] denotes the recording time domain.Then, s(t) can be expressed as, where δ(•) is the Dirac delta function.We smooth the spike trains using a Gaussian kernel K σ (t) = e −t 2 /(2σ 2 ) /( √ 2πσ), where σ = 50ms denotes the kernel width.Therefore, the smoothed spike train is ( This smoothing process is illustrated using one example train in Fig. 2, where we show the original spike times {δ(t − t i )}, the Gaussian kernels {K σ (t − t i )}, and the smoothed spike train K σ (t − t i ).All 60 smoothed spike trains in each path are shown in Fig. 3.These functions have been sampled at 20Hz for use in computer programs.From these data, we observe that the smoothed functions have similar pattern within each class; for example, they have similar number of peaks and the locations of these peaks are only slightly different.However, the peak locations across different paths are significantly different.For the purpose of registration, we will align these functions by removing such phase variability.This phase variability can also be used to compare any two functions.

Goals: Alignment, metrics and decoding
Now that the spike trains have been transformed into smooth functions, they are ready for functional data analysis.There are several immediate goals that can At first, we introduce some notation.For simplification of presentation, we fix the time domain as [0, 1] (This domain can be easily adapted to any other time interval).Let H be the set of orientation-preserving diffeomorphisms of the unit interval [0, 1]: Elements of H form a group, i.e. (1) for any h 1 , h 2 ∈ H, their composition h 1 • h 2 ∈ H; and (2) for any h ∈ H, its inverse h −1 ∈ H, where the identity is the self-mapping h id (t) = t.

Alignment of smoothed spike trains
Let {f i : [0, 1] → R + |i = 1, 2, . . ., n} represent a set of smoothed spike trains.In the first goal -the temporal alignment of spike trains -we want to find time-warping functions h i s such that the set {f i • h i } are well aligned.The aligned functions {f i •h i } represent the amplitude components while the warping function {h i } represent the phase components of the given data.
As an example of phase-amplitude separation of spike train data, we show a result based on all spike trains resulting from Path 1 in Fig. 4. Basically, a set of 60 smoothed functions {f i } forms the original data and is shown in panel (A).The cross-sectional mean and mean ± standard deviation (std) of {f i } are shown in panel (B).While different techniques for optimal alignments are presented in accompanying papers, we simply use an arbitrary method here to illustrate the ideas.The optimal warping functions for registration {h * i } and the corresponding aligned functions { fi = f i • h * i } under this method are shown in panels (C) and (D), respectively.We see that most of the warping functions {h * i } are around the identity h id (t) = t.This indicates that spike train functions in Path 1 only have slight variability along the time axis for registration.Finally, we show the cross-sectional mean and mean ± std of { fi } in panel (E).It is observed that the given data (in panel (A)) has a lot of phase variability (in peak and valley locations), and such variability is removed with a tight alignment of functions with sharper peaks and valleys (in panel (D)).This means that the effects of phase variability have been completely removed and only the amplitude variability remains.Also, the plot of mean ± std in panel (E) shows a much tighter arrangement of bands around the mean function due to the alignment as compared to the one before alignment (in panel (B)).
In addition to alignment of spike trains within the same activity class, there is a great interest in studying the phase and amplitude variability in spikes belonging to different classes.To illustrate that idea, we take spike trains from multiple classes and jointly align them.For example, we can combine all 120 spike train functions in Paths 1 and 4 (shown in the top, left panel of Fig. 5) and then apply an alignment algorithm on them.We can see that these functions can be well aligned (shown in the top, rightmost panel).Note that functions in Path 1 concentrate around the beginning time (0-2 sec) whereas the functions in Path 4 concentrate around ending time (3-5 sec).Therefore, most warping functions in Path 1 (top row, second panel) are slightly above the 45 • line, which indicates that the concentrated period slightly moves right in the time domain.In contrast, most warping functions in Path 4 (first row, third panel) are slightly below the 45 • line which indicates a movement in the left direction.
Similarly, we can register the 120 spike trains in Paths 2 and 3 (see bottom row, first panel in Fig. 5), where functions in Path 2 concentrate around 1-3 sec whereas the functions in Path 3 concentrate around 2-4 sec.The aligned functions are shown in the fourth panel in the bottom row.By removing the phase variability along the time axis, these functions retain the concentration around 1-4 sec.Looking at the warping functions in Path 2 (bottom row, second panel), we find a similar pattern as that in Path 1 (moving rightward).In this case, the warping is relatively more moderate due to lower contrast to Path 3 in the time domain.Similarly, warping functions in Path 3 (bottom row, third panel) are slightly below the 45 • line which indicates a shift in the left direction.

Neural decoding using warping metrics
From Fig. 1(B), we can see that the spike trains in all paths have approximately the same total number of spikes.However, the temporal distributions, or phase components, of spike trains in different paths are very different from each other.Basically, this motor cortical neuron has high firing rates when the hand moves upward, but relatively lower rates in the other three directions.This can also be observed in Fig. 3 where the smoothed functions have larger value during upward movements.We expect to find that the optimal warpings are close to identity h id (t) = t within the same path, but can be more drastic across different paths.One can utilize metrics between amplitude and phase components, or a combination or both, to decode neural activity and potentially infer the movement path from the given neural signals.In the preliminary experiments, there is evidence that the phase distances can properly address the dissimilarity of spike trains between different paths, and therefore can be used to infer the movement path using neural activity.

Conclusion
A statistical analysis of spike trains is central in decoding neural signals.Due to the inherent nature of spike trains, the phase-amplitude separation of functional (smoothed) versions of spike trains can play an important role in this decoding.The alignment of smoothed spike trains separates the phase and amplitude components which, in turn, can provide important clues about the information conveyed in those spike trains.We introduce a real dataset obtained from experimental recordings of spike trains from the motor cortex of a monkey during hand movement tasks.This recording is ideally suited for studying phase-amplitude separation methods and exploring the roles of these components in classifying different hand movements.

Fig 1 .
Fig 1. (A) Four trajectories of hand movement in the SP task.The four colors (blue, red, green, and cyan) indicate the trajectories started at corners 1, 2, 3, and 4, respectively, where the corners are also shown in the correspond colors.That is, Path 1:1 → 2 → 3 → 4 → 1, Path 2: 2 → 3 → 4 → 1 → 2, Path 3: 3 → 4 → 1 → 2 → 3, and Path 4: 4 → 1 → 2 → 3 → 4.(B) 10 spike trains randomly chosen from each path.Each thin vertical line indicates the time of a spike, where the colors are the same as that in (A).Each black, thick vertical line is the time of reaching a corner.One row is for one trial.

Fig 2 .Fig 3 .
Fig 2.An illustration of spike train smoothing with a Gaussian kernel.Upper panel: one example spike train in Path 4 and the corresponding kernel functions.Lower panel: smoothed spike train.

6 (Fig 4 .
Fig 4. (A) Given smoothed spike train functions in Path 1. (B) The cross-sectional mean and mean ± std of the given functions.(C) The warping function for registration.(D) Aligned functions.(E) The cross-sectional mean and mean ± std of the aligned functions.

WFig 5 .
Fig 5. Top row: the given smoothed spike train functions in Paths 1 and 4 (first panel), the optimal warping function in Path 1 (second panel) and Path 4 (third panel), and aligned functions (last panel).In all plots, solid lines are for functions in Path 1 and dashed lines are for functions in Path 4. Bottom row: same as the top row except for Paths 2 and 3.