Analysis of spike train data: Alignment and comparisons using the extended Fisher-Rao metric

: We present a metric-based framework for analyzing statistical variability of the neural spike train data that was introduced in an earlier paper on this section [ 14]. Treating the smoothed spike trains as functional data, we apply the extended Fisher-Rao Riemannian metric, ﬁrst introduced in Srivastava et al. [9], to perform: (1) pairwise alignment of spike functions, (2) averaging of multiple functions, and (3) alignment of spike functions to the mean. The last item results in separation phase and amplitude components from the functional data. Further, we utilize proper metrics on these components for classiﬁcation of activities represented by spike trains. This approach is based on the square-root slope function (SRSF) representation of functions that transforms the Fisher-Rao metric into the standard L 2 metric and, thus, simpliﬁes computations. We compare our registration results with some current methods and demonstrate an application of our approach in neural decoding to infer motor behaviors.


Introduction
While introducing the spike train data being considered here, Wu et al. [14] describe a need for phase-amplitude separation in functional data obtained by smoothing spike trains.They argue that phase and amplitude components contain specific information about underlying neural signals, and one can facilitate decoding using metrics involving these two components individually.Here we will apply a recently-developed method for function registration to analyze smoothed spike trains.As described in [14], the data is a set of spike trains in primate motor cortex recorded during four different movement patterns.Our goal is to register, or align, the smoothed spike trains using nonlinear warping functions.Then, we seek to use these components to quantify the temporal variability in the data and perform neural decoding with respect to different behaviors.We remark that the extended Fisher-Rao method is developed for real-valued functions that are absolutely continuous.Since the recorded dis-crete spike trains have already been smoothed using a Gaussian kernel, this requirement is easily met.
In the following we briefly introduce the methodology and then present a variety of results on smoothed spike trains involving their registration and analysis.

Extended Fisher-Rao framework
This framework was originally motivated by shape analysis of curves in Euclidean spaces [8] and was introduced for functional data analysis in the papers [9,4,11].It has also been used in several accompanying papers, e.g.[5,12,1], in this special section.We summarize the main ideas and refer the reader to these papers for details.
Mathematical setup For simplification on presentation, we rescale the time domain to be [0, 1] and let H be the set of orientation-preserving diffeomorphisms of the unit interval [0, 1]: h is a diffeo}.The elements of H form a group (with composition being the binary operation) with identity element being h id (t) = t.We will use f to denote the L 2 norm ( . Let F denote the set of all absolutely continuous functions on the interval [0, 1].For any f ∈ F , we define its square-root slope function (SRSF) as q : [0, 1] → R, q(t) ≡ sign( ḟ (t)) | ḟ (t)|.It can be shown that that if f is absolutely continuous then its resulting SRSF is square integrable.Thus, we will define L 2 ([0, 1], R) (or simply L 2 ) to be the set of all SRSFs.For every q ∈ L 2 there exists a function f (unique up to a constant, or a vertical translation) such that the given q is the SRSF of that f .If we warp a function f by h, the SRSF of f • h is given by: q(t) = q(h(t)) ḣ(t).We will denote this transformation by (q, h) = (q • h) ḣ.
Let F 0 ⊂ F denote the set of functions f such that ḟ > 0. For any f ∈ F 0 and v 1 , v 2 ∈ T f (F 0 ), where T f (F 0 ) is the tangent space to F 0 at f , the Fisher-Rao Riemannian metric is defined as the inner product: It is the nonparametric version of the more common parametric form used in statistics [7,3].This metric is somewhat complicated for directly using in functional data analysis.For instance, it is not easy to derive geodesic equation between arbitrary points in F under this metric.However, a small transformation provides an enormous simplification of this task.It has been shown in [9,8] that under the SRSF representation, the Fisher-Rao Riemannian metric becomes the standard L 2 metric.Furthermore, this construction can be extended from F 0 to full F , since the space of SRSFs of all elements of F is simply the L 2 space.Therefore, it is called an extension of the Fisher-Rao metric.As a consequence, the corresponding geodesics in the SRSF space are straight lines and the geodesic distance is simply the L 2 norm.More precisely, if we let q 1 , q 2 ∈ L 2 denote the SRSFs of f 1 , f 2 ∈ F , respectively, then the extended Fisher-Rao distance between them in F is simply The success of this framework comes from an important isometry property.More precisely, for any two SRSFs q 1 , q 2 ∈ L 2 and h ∈ H, (q 1 , h) − (q 2 , h) = q 1 − q 2 .Due to this isometry, we can formally define phase and amplitude components of functions, and impose distances on these components as follows.
The amplitude of a function f is formally defined to be its orbit under H, given by [f ] = closure{f •h|h ∈ H}; it is the set of all warpings of a function, and their limit points.The corresponding set in the SRSF space is: [q] = closure{(q, h)| h ∈ H} ⊂ L 2 , where q is the SRSF of f .Let S denote the set of all such orbits in L 2 .To compare amplitudes of any two functions we need a metric on S.
Definition 1.For any two functions f 1 , f 2 ∈ F and the corresponding SRSFs, q 1 , q 2 ∈ L 2 , we define the amplitude distance d a to be: and the optimal time warping from q 2 to q 1 is computed as: h * is also called the relative phase of f 2 with respect to f 1 .One can show that d a is a proper distance (i.e. it satisfies positive definiteness, symmetry, and the triangle inequality) on S. The solution to the optimization in Eqn. 3 comes from a dynamic programming algorithm in a discretized domain [9].Eqn. 2 also provides a tool for pairwise registration, or separation of relative phase, between two functions.
Multiple function registration For a set of given functions, our goal is to separate their phase and amplitude components by warping/aligning them to a template.The template is selected such that its amplitude and phase denote the mean amplitudes and mean phases of the given functions.It is constructed in three steps: 1) For a given collection of functions {f i }, and their SRSFs {q i }, we compute the mean of their amplitudes {[q i ]} under d a in S; we will call it [µ].
2) We select an element, call it µ q , of this orbit in such a way that the relative phases of q i s to µ q , obtained via using Eqn.2, have an average of h id .3) The {h i }s are then used to align {f i } using fi = f i • h i .The set { fi } forms the amplitudes and the set {h i } forms the phases of {f i }.The details are presented next.
Step 1 (Mean amplitude function in S).At first, we consider the problem of finding the mean of the amplitudes of given functions.Definition 2. Define the mean amplitude [µ] of the given amplitudes {[q i ]} as a local minimum of the sum of squares of amplitude distances: We emphasize that the amplitude mean [µ] is actually an orbit, rather than being a single function.The full algorithm for computing the mean amplitude is given next.
1. Initialization step: Select µ = q j , where j is any index in argmin 1≤i≤n ||q i − For each q i find h * i by solving: h * i = argmin h∈H µ − (q i , h) .3. Compute the aligned SRSFs using qi → (q i , h * i ).Step 2 (Mean relative phase).So far [µ] is just an amplitude and does not have a phase.We allocate it a phase which is the mean of phases associated with the given functions {f i }.The resulting function is treated as a template or a reference for final alignment.This construction is accomplished through a notion called the center of an orbit (see [9]).Definition 3.For a given set of SRSFs q 1 , q 2 , . . ., q n and q, define an element q of [q] as the center of the orbit [q] with respect to the set {q i } if the warping functions {h i }, where h i = argmin h∈H q − (q i , h) , have the mean h id .
In other words, the relative phases of given functions with respect to this center is identity.The center is constructed as follows.
Algorithm 2 (Center of an orbit).WLOG, let q be any element of the orbit [q]. 1.For each q i find h i by solving: h i = argmin h∈H q − (q i , h) .2. Compute the mean hn of all {h i }.The center of [q] wrt {q i } is given by q = (q, h−1 n ).Let h * i be the warping that aligns q i to q.One can show that h * i = h i • h−1 n , and therefore the mean of {h * i } is hn • h−1 n = h id .We will apply this setup in our problem by finding the center of [µ] with respect to the SRSFs {q i }.Now we can utilize Algorithms 1 and 2 to present the full procedure for functional alignment.
Phase-amplitude separation algorithm We are given a set of functions {f i } on [0, 1].Let {q i } denote the SRSFs of {f i }, respectively.
1. Computer the mean amplitude of {[q i ]} in S using Algorithm 1. Denote it by [µ]. 2. Find the center of [µ] w.r.t.{q i } using Algorithm 2; call it µ q .3. For i = 1, 2, . . ., n, find the relative phases h * i by solving: h * i = argmin h∈H µ q − (q i , h) .4. Return the warping functions {h * i }, the aligned SRSFs qi = (q i , h * i ), and the aligned functions fi = The aligned function { fi } are used to display the amplitudes and {h * i } denote the phases of the input functions.

Experimental results
Here we present the registration and decoding results for the spike train data using this framework.

Within class registration
The phase-amplitude separation of 60 functions in Path 1, using our algorithm, is shown in Fig. 1.The original functions {f i } are shown in Panel (A), and their cross-sectional mean and mean ± standard deviations (std) of {f i } are shown in Panel (B).The relative phases {h * i } and the corresponding aligned functions { fi } are shown in Panels (C) and (D), respectively.We see that most of the warping functions are around the identity h id (t) = t.This indicates that spike train functions in Path 1 only have slight variability in time axis for registration.Finally, we show the cross-sectional mean and mean ± std of { fi } in Panel (E).It is observed that while the given data (in Panel (A)) shows a lot of phase variability (in peak and valley locations), this variability is removed with a tighter alignment of functions with sharper peaks and valleys (in Panel (D)).
The registrations within Paths 2, 3 and 4 are shown in three rows, respectively, in Fig. 2. The three panels in each row show the original functions, the phase components, and the aligned functions, respectively.Similar to the result in Path 1, all functions can be properly aligned with slight adjustments on the time axis.Comparison with other alignment methods Here we compare our alignment performance with several previous methods -principal analysis by conditional expectation (PACE) presented in [10], the self-modeling registration (SMR) in [2], and the minimum second eigenvalue (MSE) in [6].Due to space limitation, we only show the registration result of each method only on spike data from Path 1 in Fig. 3.We see that, as compared to the original unaligned data (see Fig. 1A), the PACE, SMR and MSE methods do not fully align the main features (i.e.peaks and valleys) in the spike train functions; we can still observe many unaligned features.In contrast, our method appears to perfectly align all peaks and valleys in the data, and the registration performance is clearly superior to the other three methods.Moreover, this method is completely parameter free.

Across class registration
We have shown the registration result of spike trains within each path.All these functions can also be pooled and aligned together.For example, we can put together all 120 spike train functions in Paths 1 and 4, and then apply the alignment algorithm.The result is shown in the first row of Fig. 4. We can see that these functions are very well aligned (shown in the fourth panel).Similarly, we can register the 120 spike trains in Paths 2 and 3 (see first panel, row 2 in Fig. 4), where functions in Path 2 concentrate around (1-3 sec) whereas the functions in Path 3 concentrate around (2-4 sec).The aligned functions are shown in the fourth panel in the bottom row.By removing the phase variability, these functions are concentrated between (1-4 sec).Looking at the warping functions in Path 2 (bottom row, second panel), we find similar pattern as that in Path 1 (moving rightward).In this case, the warping is relatively more moderate due to lower contrast to Path 3 in the time domain.In contrast, warping functions in Path 3 (bottom row, third panel) are slightly below the 45 • line which indicates moving in the left direction.

Neural decoding using phase distance
From Figs. 1 and 2, we can see that the optimal warpings are close to identity h id (t) = t within the same path.However, the optimal warpings between spike trains from different paths can be more drastic because the their dominant peaks   distribute differently.Here we will utilize this temporal difference to decode neural activity and infer the movement path from the given neural signals.To emphasize those phase differences, we will estimate the relative phases of these spike trains functions directly.For f 1 , f 2 ∈ F , let h * be the relative phase of f 2 with respect to f 1 as given in Eqn. 3 (defined using the SRSFs).The relative phase difference between these two functions can be quantified as the intrinsic distance between h * and the identity h id .We denote this distance in time warping as phase distance [13] and compute it as: For all 240 trials (60 trials in each path) in the data, we compute all pairwise d p distances.As a summary, we found that the average distances within the same path (over all 4 paths) is 0.54 ± 0.09, but across different paths is 0.60 ± 0.05.This indicates that the phase-distances can properly address the dissimilarity of spike trains between different paths, and therefore can be used to infer the movement path using neural activity.Based on these distances, we perform the standard leave-one-out cross-validation to infer movement path using observed spike train functions.It is found that the decoding accuracy is 71% with respect to the 4 different paths.This result clearly outperforms a random guess with 25% accuracy.

Summary
Characterizing the temporal variability of neural firing pattern has been a central problem in neural coding.In this paper, we apply the extended Fisher-Rao Riemannian framework to separate phase-amplitude components in spike train functions and to measure the differences in these components across observations.Experimental results shows that this framework is very successful in phase-amplitude separation, and the performance is superior to several previous methods.We also present some preliminary neural decoding a distance in the time domain (i.e., variability in the phase), and significant improvement over random guess is obtained.

4 .
If the increment 1 n n i=1 qi −µ is small, then stop.Else, update the mean using µ → 1 n n i=1 qi and return to step 2. The iterative update in Steps 2-4 is based on the gradient of the cost function given in Eqn. 4.

6 (Fig 1 .
Fig 1. (A) Given spike train functions in Path 1. (B) The mean and mean ± std of the given functions.(C) The optimal warping function for registration.(D) Aligned functions.(E) The mean and mean ± std of the aligned functions.

Fig 2 .
Fig 2.First row: the given smoothed spike train functions in Path 2 (left panel), the optimal warping function for registration (middle panel), and the aligned functions (right panel).Second row and third row: same as the first row except for Path 3 and Path 4, respectively.
Fig 4.First row: the given smoothed spike train functions in Paths 1 and 4 (first panel), the optimal warping function in Path 1 (second panel) and Fig 4.First row: the given smoothed spike train functions in Paths 1 and 4 (first panel), the optimal warping function in Path 1 (second panel) and Path 4 (third panel), and the aligned functions (fourth panel).In all plots, solid lines are for functions in Path 1 and dashed lines are for functions in Path 4. Second row: same as the first row except for Paths 2 and 3.