Central limit theorems for Fréchet means in the space of phylogenetic trees

This paper studies the characterisation, and the limiting distributions, of Fréchet means in the space of phylogenetic trees. This space is topologically stratified, as well as being a CAT(0) space. We use a generalised version of the Delta method to demonstrate non-classical behaviour arising from the global topological structure of the space. In particular, we show that, for the space of trees with four leaves, the limiting distributions are related to the Gaussian distribution, but the forms they take depend on the co-dimensions of the strata in which the Fréchet means lie.


1 Introduction
It has become increasingly common in various research areas for statistical analysis to involve data that lies in non-Euclidean spaces, such as manifolds or even topologically stratified spaces. Two such examples are the statistical analysis of shape (cf. [6] & [15]) and the analysis of phylogenetic trees (cf. [4]). Consequently, many statistical concepts and techniques have been generalised and developed to adapt to such phenomena.
In this paper we focus on developing a central limit theorem on the space of phylogenetic trees, which is a topologically stratified space (cf. [14]). A phylogenetic tree represents the evolutionary history of a set of organisms and, as such, is one of the main data objects in evolutionary biology. Some methods have been developed for statistically evaluating phylogenetic trees (cf. [7] & [23]). However, these approaches often do not incorporate both the tree topology and the edge lengths, which could represent mutation rate for example, in a holistic way. Addressing this deficiency was one of the goals of the construction of a space of phylogenetic trees [4] in which branch lengths, and hence tree topologies, vary continuously in a natural way. This space is a piecewise Euclidean metric space, and thus approaches from Euclidean statistics can be defined and generalised in it. To date, some further statistical theory within this space, including methods for non-parametric bootstrap and hypothesis testing, has been developed by Holmes (cf. [10], [11] & [12]), while the Fréchet mean and variance within this space were defined in [17]. We continue with these theoretical investigations, which fit in with the larger research goal of developing rigorous statistical analyses for topologically stratified spaces that was initiated by the working group on sampling from stratified spaces of the Statistical and Applied Mathematical Sciences Institute (SAMSI) 2010-11 program on Analysis of Object Data.
The main results of this paper show that central limit theorems hold for Fréchet means in the space of phylogenetic trees with four leaves and that the limiting distributions of the sample Fréchet means are closely associated with multivariate Gaussian distributions. In particular, we prove that there is a central limit theorem regardless of whether the Fréchet mean is in a top-dimensional stratum (Theorem 3.1), in a co-dimension one stratum (Theorem 4.4), or at the cone point of the space (Theorem 5.2). The central limit theorems describe the behaviour of the sample Fréchet means around the true Fréchet mean as the sample size increases. Thus, our results have implications for the statistical analysis of phylogenetic trees. One example would be hypothesis testing: if we have samples from two potentially different distributions of trees, we may be able to reject the hypothesis that the two distributions are the same by computing the Fréchet means of the samples and comparing the distance between them with that expected under the central limit theorem.
The concept of Fréchet means of random variables on a metric space is a generalisation of the least mean-square characterisation of Euclidean means: a point is a Fréchet mean of a probability measure $\mu$ on a metric space $(M, d)$ if it minimises the Fréchet function for $\mu$ defined by
$$x \mapsto \frac{1}{2} \int_M d(x, x')^2 \, d\mu(x'). \qquad (1.1)$$
Various aspects of Fréchet means have been studied for non-Euclidean spaces, including Riemannian manifolds and certain stratified spaces. Among other applications, the first use of Fréchet means to provide nonparametric statistical inference, such as confidence regions and two-sample tests for discriminating between two distributions, was carried out in [2] and [3] for both extrinsic and intrinsic inference applied to manifolds, while the earlier work [8] and [9] provided similar inference restricted to extrinsic means on regular submanifolds of Euclidean spaces.
We first review some of the ideas and results in the literature which are relevant to our investigation of Fréchet means on tree space. When $M$ is a Riemannian manifold with the distance function being that induced by its Riemannian metric, the results on central limit theorems for Fréchet means can be found in [3] and [16]. The result and the proof of the classical central limit theorem rely on the global linear structure of Euclidean space. Hence, a crucial step in both [3] and [16] is to find a relationship between the sample Fréchet means of a sequence of random variables in $M$ and the sample Euclidean means of a sequence of appropriately defined random variables in Euclidean space. The way that [3] achieves this is to use an embedding of the support of the distribution of the random variables into a Euclidean space. Then, the chosen embedding maps the sequence of random variables in $M$ to a sequence of random variables in Euclidean space. This allows the authors to apply known results in Euclidean spaces to the resulting sequence of random variables to obtain a central limit theorem where, as expected, the limiting distribution depends on the chosen embedding.
Among others, [16] explores another relationship between Fréchet means and Euclidean means to study the limiting behaviour directly on the manifold itself. Since the gradient of the Fréchet function must be zero at a Fréchet mean and since $\operatorname{grad}_1 d(x, x')^2 = -2 \exp_x^{-1}(x')$, where $\operatorname{grad}_1$ denotes the gradient of $d^2$ with respect to its first variable and $\exp_x^{-1}$ is the inverse exponential, or logarithmic, map at a point $x$ in $M$, it follows that, if $x_0$ is a Fréchet mean of $\mu$, then
$$\int_M \exp_{x_0}^{-1}(x') \, d\mu(x') = 0.$$
Thus, when $x_0$ is a Fréchet mean of $\mu$ on the Riemannian manifold $M$, the origin of the tangent space of $M$ at $x_0$ is the Euclidean mean of the probability measure induced on that space by the log map $\exp_{x_0}^{-1}$ from $\mu$. The difficulty arising from this method is that the log map varies with the reference point $x_0$. As the number of random variables increases, their sample Fréchet mean changes. This results in the sequence of the random variables in $M$ being mapped to different sequences of random variables in the Euclidean space as their number increases. Hence, the classical central limit theorem cannot be applied directly. To deal with this, [16] uses the notion of parallel transport in Riemannian geometry. This gives intrinsic results, which show how the global geometry of the space influences the limiting probability measure defined on the tangent space at the Fréchet mean under consideration. On the other hand, the results in both papers imply that, since manifolds are locally homeomorphic to Euclidean spaces, the limiting distributions for sample Fréchet means on Riemannian manifolds are usually Gaussian, a phenomenon similar to that for Euclidean means.
The topological structure of spaces also plays a role in the limiting behaviour of sample Fréchet means, as studied in [1] and [13]. In [13], a non-classical phenomenon of central limit type theorems for Fréchet means is observed that does not occur in the case of Riemannian manifolds. This applies to a metric space with an open book decomposition, that is, one isometric with a disjoint union of copies of a half Euclidean space, or 'pages', identified along their boundary hyperplanes to form the 'spine'. The features, termed 'sticky' and 'partly sticky', observed in [13] are that, under mild conditions, when the Fréchet mean of a probability measure on such a space lies in the spine, its iid sample Fréchet means will lie either on the spine or in one single half-space, for all sufficiently large sample sizes. This implies, in particular, that in this case the support of the limiting distribution is either the spine or contained in one page.
Open books are one of the simplest non-trivial topologically stratified spaces, and any stratified space that is singular along a stratum of co-dimension one is locally homeomorphic to an open book along that stratum. This paper is a continuation of the investigation initiated in [13], in the direction of central limit type theorems on stratified spaces. It is also a first step towards the study of central limit type theorems for Fréchet means on the spaces of phylogenetic trees. A space of phylogenetic trees, or tree space for short, is formed from a disjoint union of Euclidean orthants of a given dimension with the identification of certain sets of faces of various co-dimensions. The simplest tree space is that for trees with three leaves: it comprises three half lines glued at their ends, so it is a special case of an 'open book'. This paper concentrates on the space of phylogenetic trees with four leaves to investigate how other aspects of the global structure of the tree space influence the limiting behaviour of sample Fréchet means, including the so-called 'stickiness' feature. In the case of open books, the paper [13], by combining half-planes appropriately, in effect turns the problem into a Euclidean one as long as the Fréchet mean does not lie on the spine. Unfortunately, this feature of Fréchet means on 'open books' no longer holds for those in tree space. In particular, the Fréchet mean of a random variable in tree space has, in general, no closed form and the generalised log map for tree space is not linear with respect to the points at which it is defined. The reason for this is the existence of the 'umbral' set, which we shall define, for each given point. Although the tree space is not a Riemannian manifold, the nature of our problem is similar to that of characterising the Fréchet mean of a random variable in a Riemannian manifold. This results in a different approach to that in [13] and in a difference in the nature of the results that we obtain. We characterise the Fréchet mean of a random variable in tree space in terms of the Euclidean mean of a certain function of that random variable. Due to the global structure of the tree space, the function obtained in this way also depends on the Fréchet mean of that random variable in tree space. Although the technique of parallel transport used in [16] is inapplicable here, we use the local Euclidean structure of the top strata of tree space to analyse this dependence explicitly. Then, the various properties of such functions enable us to derive the central limit theorems on tree space.
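As a small illustration of the 'stickiness' mentioned above, the following Python sketch computes the sample Fréchet mean on the tree space for three leaves, that is, three half lines glued at their ends. It relies on the standard closed form for such a space, which is an assumption not taken from this paper: along a given half line, the derivative of the Fréchet function at distance $t$ from the origin is $t - m$, where $m$ averages $+x$ over the sample points on that half line and $-x$ over the others. The function name and the `(leg, x)` encoding are hypothetical.

```python
def spider_frechet_mean(points):
    """Sample Fréchet mean on three half lines glued at the origin.

    Each point is a pair (leg, x) with leg in {0, 1, 2} and x >= 0.
    Returns (leg, position), or (None, 0.0) when the mean is the origin.
    """
    n = len(points)
    for leg in range(3):
        # Along `leg`, the Fréchet function has derivative t - m at distance t,
        # where m counts +x for points on this leg and -x for all other points.
        m = sum(x if k == leg else -x for k, x in points) / n
        if m > 0:
            return leg, m
    # No leg pulls the mean away from the origin: the mean 'sticks' there.
    return None, 0.0


balanced = [(0, 1.0), (1, 1.0), (2, 1.0)]
perturbed = [(0, 1.1), (1, 1.0), (2, 1.0)]   # small change, mean stays put
lopsided = [(0, 3.0), (1, 1.0), (2, 1.0)]    # one leg dominates
print(spider_frechet_mean(balanced), spider_frechet_mean(perturbed))
print(spider_frechet_mean(lopsided))
```

The balanced and the slightly perturbed samples both have their mean at the origin, which is the sticky behaviour; only when one half line carries enough weight does the mean move onto it.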
The paper is organised as follows. In Section 2, we introduce the space $Q_5$, a subspace of $\mathbb{R}^3$ consisting of a cycle of five quadrants, and concentrate on the characterisation of, and the central limit theorem for, Fréchet means in $Q_5$ that do not lie at the origin (Proposition 2.2). Although $Q_5$ is globally flat away from the origin, the result and the methods used in this section make it clear that the central limit theorem for Fréchet means in $Q_5$ takes a different form from its counterpart in Euclidean space. In addition to their intrinsic interest, the results and the approach in this section also form a basis for our investigations of $T_4$, the space of trees with four leaves, in the following sections.
The three remaining sections of the paper study the limiting distributions for Fréchet means in $T_4$ for the three possible co-dimensions of the strata on which they can lie. In particular, we relate $T_4$ to $Q_5$ in such a way that the result for the top-dimensional strata is a direct consequence of that in $Q_5$ (Section 3). Note that, although the main idea here follows closely that of general central limit theorems for M-estimators in statistics as in [3], the functions involved here are neither diffeomorphisms as in [3], nor second order differentiable like general M-estimators, except for the special case where the support of the distribution is diffeomorphic with a Euclidean space. The cases when Fréchet means lie either on co-dimension one strata (Section 4) or at the cone point (Section 5) require additional analysis and the results there take different forms. We show that, when a Fréchet mean lies on a co-dimension one stratum, the limiting distribution can take one of three possible forms, distinguished by the nature of its support. This support may be the one-dimensional Euclidean space containing the co-dimension one stratum where the Fréchet mean lies, or a two-dimensional half Euclidean space whose boundary contains that co-dimension one stratum, or the union of two such half spaces. In contrast, when a Fréchet mean is at the cone point, the intersection of the support of the limiting distribution with any given quadrant can be either an empty set or a cone. In all these cases, the limiting distributions are linked closely with Gaussian distributions in Euclidean spaces. The Appendix contains a brief account of the underlying geometry of tree spaces.
2 Fréchet means in $Q_5$
Let $Q_5$ be the union of the five quadrants embedded in $\mathbb{R}^3$, with coordinates $u, v, w$ as shown in Figure 1, and let $d_Q$ denote the intrinsic metric on $Q_5$. Without its cone point $(0, 0, 0)$, $Q_5$ is a flat non-complete Riemannian surface. However, the inclusion of the cone point, by allowing the realisation of geodesics through that point, makes $Q_5$ into a geodesic metric space of non-positive curvature, a so-called CAT(0) space (cf. [5]).
Nevertheless, it still has the isometry that permutes the five quadrants cyclically in the obvious fashion, fixing the cone point. This implies, in particular, that the square of the distance from a fixed point in $Q_5$ remains differentiable on each of the open semi-axes.
In this section, we assume that $\mu$ is a probability measure on $Q_5$ such that its Fréchet function is finite and that $\{\xi_l\}$ is a sequence of iid random variables in $Q_5$ with probability measure $\mu$. Then, the fact that $Q_5$ has non-positive curvature implies that the Fréchet mean $(\hat u, \hat v, \hat w)$ of $\mu$ exists and is unique, and that, when it lies in the region where $Q_5$ is a Riemannian manifold, i.e. when $(\hat u, \hat v, \hat w) \neq (0, 0, 0)$, it is characterised by the vanishing there of the gradient of the Fréchet function, which is condition (2.1) (cf. [20] & [16] respectively). We shall exclude the case where the Fréchet mean is at the cone point. Then, the symmetry of the five quadrants of $Q_5$ implies that, without loss of generality, we may restrict ourselves to the case where $(\hat u, \hat v, \hat w)$ lies in the interior of the subset of $Q_5$ determined by $w = 0$.
Consider Figure 1 for a fixed $u_0 > 0$ and $v_0 \geq 0$. The geodesic in $Q_5$ from the point $(u_0, v_0, 0)$ to a point in the (closed) dark shaded areas either is a straight line segment in the full $(u, v)$-plane, or becomes a straight line segment when the relevant quadrant is 'folded down' into the $(u, v)$-plane. The light grey shading in Figure 1 shows the set of points $q$ in $Q_5$ from which the geodesic to $(u_0, v_0, 0)$ is a bent line that is the union of two segments: one from $(u_0, v_0, 0)$ to the origin and the other from the origin to $q$. We denote the union of the darker (open) shaded regions by $I^Q_{(u_0,v_0)}$ and the 'umbral' set, the union of the (closed) lighter shaded regions, by $U^Q_{(u_0,v_0)}$. For example, in the extreme case when $v_0 = 0$, and so $\alpha = 0$ in Figure 1, $U^Q_{(u_0,v_0)}$ is the back quadrant defined by $u \leq 0$, $v = 0$, $w \geq 0$ and $I^Q_{(u_0,v_0)}$ is the union of the other four quadrants. By the symmetry, we can easily derive the forms of $I^Q_{(u_0,v_0)}$ and $U^Q_{(u_0,v_0)}$ for other possible $(u_0, v_0)$ for which $(u_0, v_0, 0)$ lies in the interior of the subset of $Q_5$ determined by $w = 0$.
For points (u 0 , v 0 , 0) in Q 5 , we define a map Ψ (u0,v0) from Q 5 to R 2 , that is an isometry on I Q (u0,v0) and collapses U Q (u0,v0) to the line u 0 v = v 0 u, isometrically on each relevant ray through the origin.We shall see that this map is closely related to the expression for the gradient of the squared distance function d 2 Q at points with zero w-coordinate.It is defined by where (u 0 , v 0 ) is the standard norm in R 2 and similarly (u, v, w) = √ u 2 + v 2 + w 2 is the distance of (u, v, w) from (0, 0, 0) in Q 5 , and where It can easily be checked that Ψ (u0,v0) (u 0 , v 0 , 0) = (u 0 , v 0 ) and that Ψ s(u0,v0) = Ψ (u0,v0) for all s > 0. The squared distance function from (u 0 , v 0 , 0) with, for example, u 0 > 0 and v 0 0 to any point (u, v, w) in Q 5 can be expressed explicitly as From this, we deduce that It can be checked that (2.4) holds for all (u 0 , v 0 , 0) in the interior of the subset of Q 5 determined by w = 0.For such (u 0 , v 0 , 0), R 2 may be identified with the tangent space to Q 5 at that point and so equation (2.4) implies that we may regard whose length is the same as the distance between those two points.However, although ) is surjective, it is not injective: the exponential map itself is only defined on the subspace of the tangent space corresponding to I Q (u0,v0) .By (2.1), a direct consequence of (2.4) is the following characterisation of the Fréchet mean of µ, when it is away from the cone point, in terms of the Euclidean means of the random variables , where ξ is a random variable in Q 5 with probability µ.
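The verbal description of the map from $Q_5$ to the plane (an isometry away from the umbral set, a collapse of the umbral set onto the ray opposite the base point) can be sketched numerically. The Python code below is a minimal sketch under stated assumptions, not the paper's own formulae: it treats $Q_5$ as a flat cone of total angle $5\pi/2$, assumes the five semi-axes occur in the cyclic order $+u, +v, -u, +w, -v$, and uses hypothetical helper names (`to_polar`, `signed_gap`, `dist_Q`, `Psi`). It then checks that the image of a point, minus the base point, has length equal to the intrinsic distance between the two points.

```python
import math

TOTAL = 2.5 * math.pi  # assumed total angle at the cone point: five quadrants of pi/2


def to_polar(u, v, w):
    """Radius and cyclic angle of a point of Q_5 (hypothetical parametrisation).

    Assumed cyclic order of semi-axes: +u (0), +v (pi/2), -u (pi), +w (3pi/2), -v (2pi).
    """
    r = math.sqrt(u * u + v * v + w * w)
    if r == 0:
        return 0.0, 0.0
    if w == 0 and v >= 0:                  # quadrants (+u,+v) and (+v,-u)
        return r, math.atan2(v, u)
    if v == 0 and u <= 0 and w >= 0:       # quadrant (-u,+w)
        return r, math.pi + math.atan2(w, -u)
    if u == 0 and v <= 0 and w >= 0:       # quadrant (+w,-v)
        return r, 1.5 * math.pi + math.atan2(-v, w)
    if w == 0 and u >= 0 and v <= 0:       # quadrant (-v,+u)
        return r, 2.0 * math.pi + math.atan2(u, -v)
    raise ValueError("not a point of Q_5")


def signed_gap(a0, a):
    """Signed angular offset of a from a0, taken in [-TOTAL/2, TOTAL/2)."""
    return (a - a0 + TOTAL / 2) % TOTAL - TOTAL / 2


def dist_Q(p, q):
    """Intrinsic distance: straight when the angular gap is below pi, else via the cone point."""
    r1, a1 = to_polar(*p)
    r2, a2 = to_polar(*q)
    gap = abs(signed_gap(a1, a2))
    if gap >= math.pi:
        return r1 + r2
    return math.sqrt(max(0.0, r1 * r1 + r2 * r2 - 2 * r1 * r2 * math.cos(gap)))


def Psi(u0, v0, q):
    """Sketch of the map at base point (u0, v0, 0), assumed not to be the cone point."""
    r0, a0 = to_polar(u0, v0, 0.0)
    r, a = to_polar(*q)
    delta = signed_gap(a0, a)
    if abs(delta) >= math.pi:              # umbral set: collapse onto the ray opposite (u0, v0)
        return (-r * u0 / r0, -r * v0 / r0)
    alpha = math.atan2(v0, u0) + delta     # unfold isometrically into the (u, v)-plane
    return (r * math.cos(alpha), r * math.sin(alpha))


base = (2.0, 1.0)
for q in [(0.0, 1.0, 0.0), (-1.0, 0.0, 1.0), (0.0, -1.0, 1.0), (1.0, -1.0, 0.0)]:
    px, py = Psi(*base, q)
    print(q, math.hypot(px - base[0], py - base[1]), dist_Q((*base, 0.0), q))
```

In each printed row the two numbers should agree, and rescaling the base point by a positive factor should leave the image unchanged, in line with the properties stated in the text.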
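Lemma 2.1. Let $\mu$ be a probability measure on $Q_5$ such that its Fréchet function is finite. Then, $(\hat u, \hat v, 0) \neq (0, 0, 0)$ is the Fréchet mean of $\mu$ if and only if
$$\int_{Q_5} \Psi_{(\hat u,\hat v)}(u, v, w) \, d\mu(u, v, w) = (\hat u, \hat v). \qquad (2.5)$$
Since $\Psi_{s(\hat u,\hat v)} = \Psi_{(\hat u,\hat v)}$ for all $s > 0$, this result also implies that $(\hat u, \hat v)$ is the Fréchet mean of $\mu$ if and only if it is identical with the Euclidean mean of the random variable $\Psi_{s(\hat u,\hat v)}(\xi)$.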
To study the fluctuations of the sample Fréchet means of {ξ l } around the true Fréchet mean (û, v, 0), we first examine, for fixed (u, v, w) ∈ Q 5 , how the corresponding vector Ψ (u0,v0) (u, v, w) changes as (u 0 , v 0 ) changes.For two points (u r , v r , 0) = (0, 0, 0 , the second and third expressions above differ, on the relevant domains, from the fourth by terms which are o( ) , the summand ϕ (u2,v2) (u, v, w) lies in a wedge centred on the origin with edges determined by the vectors −(u 1 , v 1 ) and −(u 2 , v 2 ) and distant (u, v, w) from the origin.On the other hand, the first order Taylor expansion of the vector-valued function where the matrix M (u,v) is given by (2.6) Note that (u, v) M (u,v) is the projection map to the line through the origin orthogonal to (u, v).Thus, for (u 2 , v 2 ) sufficiently close to (u 1 , v 1 ), we have This analysis leads to the following central limit theorem for Fréchet means in Q 5 , away from its cone point.
Proposition 2.2.Let µ be a probability measure on Q 5 with finite Fréchet function and with Fréchet mean (û, v, 0) lying in the interior of the subset of Q 5 determined by w = 0. Also, let {ξ l } be a sequence of iid random variables in Q 5 with probability measure µ and (û n , vn , ŵn ) be the sample Fréchet mean of ξ 1 , where V is the covariance matrix of the random variable Ψ (û,v) (ξ 1 ), and M (u,v) is given by (2.6).
Proof.Since the sequence of the sample Fréchet means (û n , vn , ŵn ) converges a.s. to (û, v, 0) (cf.[20] & [22]), it follows that (û n , vn , ŵn ) lies a.s. in the interior of the subset of Q 5 determined by w = 0 for sufficiently large n.Applying Lemma 2.1 to discrete probability measures on Q 5 it follows that, for n sufficiently large such that ŵn = 0, Hence, for sufficiently large n, (2.8) By (2.7), the second summand on the right hand side of the above expression is (2.9) ) converges to the empty set as n → ∞, by replacing it for sufficiently large n with an appropriately defined cone D where, for a given > 0, D has a sufficiently small angle that we also have that, in distribution and so in probability, On the other hand, {Ψ (û,v) (ξ l )} is a sequence of iid random variables in R 2 with the Euclidean mean (û, v) by (2.5), so we can apply the classical central limit theorem to the first summand on the right hand side of (2.9) to get The required result then follows from (2.9) above.
3 Fréchet means in a top stratum of $T_4$
The tree space, $T_n$, is the moduli space of labelled $n$-trees, the moduli or parameters being the lengths of the internal edges. A labelled $n$-tree with unspecified edge lengths determines a topological type, modulo its leaves and root. Then $T_n$ is a stratified space with a stratum for each topological type, such that a topological type with $k$ internal edges determines a stratum with $k$ positive parameters ranging over the points of an open $k$-dimensional orthant. It is known (cf. [4]) that $T_n$ is a CAT(0) space. A brief description of tree space is given in the Appendix and, for a more comprehensive study, we refer readers to papers such as [4], [18], [19] and [21].
The space $T_4$ can be visualised and described via the link T4 of its cone point, the cone point being the tree whose internal edges all have zero length. The link T4 is the set of trees in $T_4$ whose edge lengths sum to 1. The entire space $T_4$ is the infinite cone on this link: for each point in the link there is a semi-infinite line in $T_4$ through this point to the cone point. This link is a finite graph, namely the Petersen graph, as illustrated in Figure 2: the quadrants of $T_4$ become edges of T4 and the semi-axes become vertices. As in the general case, each co-dimension one orthant, or semi-axis, is in the boundary of three quadrants so that the graph T4 is trivalent. A pentagon in T4 corresponds to a cycle of five quadrants in $T_4$. In T4, each edge lies in four pentagons and each vertex lies in six of them. The four pentagons containing a given edge are disjoint except for that edge and the neighbouring edges, and their union includes all the vertices and all but two edges of T4.
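The combinatorial claims about the link can be checked by brute force. The Python sketch below assumes the standard model of the Petersen graph as the Kneser graph on the 2-element subsets of a 5-element set, with edges joining disjoint subsets; this labelling and the variable names are illustrative choices rather than the labelling of Figure 2. It enumerates all 5-cycles and counts how many pass through each edge and each vertex.

```python
from itertools import combinations, permutations

# Vertices: the ten 2-element subsets of {0,...,4}; edges join disjoint subsets.
vertices = [frozenset(c) for c in combinations(range(5), 2)]
edges = {frozenset((a, b)) for a, b in combinations(vertices, 2) if not a & b}


def pentagons():
    """All 5-cycles of the graph, each stored as a frozenset of its five edges."""
    found = set()
    for cyc in permutations(vertices, 5):
        cyc_edges = [frozenset((cyc[k], cyc[(k + 1) % 5])) for k in range(5)]
        if all(e in edges for e in cyc_edges):
            found.add(frozenset(cyc_edges))
    return found


PENTAGONS = pentagons()
degree = {v: sum(v in e for e in edges) for v in vertices}
per_edge = {e: sum(e in p for p in PENTAGONS) for e in edges}
per_vertex = {v: sum(any(v in e for e in p) for p in PENTAGONS) for v in vertices}

# Union of the four pentagons through one chosen edge.
e0 = next(iter(edges))
union_edges = set().union(*(p for p in PENTAGONS if e0 in p))
union_vertices = set().union(*union_edges)

print(len(vertices), len(edges), set(degree.values()))
print(len(PENTAGONS), set(per_edge.values()), set(per_vertex.values()))
print(len(union_edges), len(union_vertices))
```

The counts correspond to the statements in the text: the graph is trivalent with fifteen edges, each edge lies in four pentagons and each vertex in six, and the four pentagons through a given edge cover every vertex and all but two edges.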
In view of the symmetries among the edges and vertices of the graph, we may take the vertices labelled $i$ and $j$ in Figure 2 to be the two vertices of an arbitrarily chosen edge; then $i_r$ (respectively $j_r$), $r = 1, 2$, will be the two other vertices which share an edge with $i$ (respectively $j$); and finally $k_{rs}$ will be the vertex together with which the vertices $i, j, i_r, j_s$ form a pentagon in T4. Following the method of depiction in [4], the first four diagrams in Figure 3 illustrate the four cycles in $T_4$ of five quadrants that correspond to the four pentagons in the Petersen graph to which the $(i, j)$-edge belongs. In each of these representations, we indicate the first four semi-axes, and hence the first three quadrants, as lying in a plane and the fifth semi-axis, and hence the remaining two quadrants, orthogonal to that plane. Then, the shaded quadrants of these four 5-cycles give 13 distinct quadrants of $T_4$. The two remaining quadrants in $T_4$ can be described using the further two 5-cycles in which the vertex $i$ (or $j$) lies, as shown in the last two diagrams of Figure 3. Thus, these 15 shaded quadrants, any two of which have, apart from the cone point, at most a semi-axis in common, together form the entire space $T_4$.
Each point in $T_4$ can be specified by the quadrant in which it lies and its coordinates in that quadrant. Hence, it can be specified by a pair of non-negative numbers $(x_r, x_s)$, where $r \neq s$ are the labels of the vertices of an edge in the Petersen graph. In general the ordering of $r$ and $s$ is unspecified so that both $(x_r, x_s)$ and $(x_s, x_r)$ represent the same point in $T_4$. However, for our purposes, when working in any 5-cycle based on the $(i, j)$-quadrant as above, we shall take the axes in the cyclic order $i, j, j_r, k_{sr}, i_s, i$. This implies that the point $(x_i, 0)$ in the $i$-semi-axis of the $(i, j)$-quadrant will be represented as $(0, x_i)$ in the $(i_r, i)$-quadrants. We regard the $j_r$-semi-axis as the negative $i$-semi-axis and the $i_s$-semi-axis as the negative $j$-semi-axis so that, for example, the point $(x_{i_r}, x_i)$ in the $(i_r, i)$-quadrant has coordinates $(x_i, -x_{i_r})$ with respect to the $i$- and $j$-semi-axes. We also regard the $k_{rs}$-semi-axis as the orthogonal semi-axis through the origin of the $(i, j)$-plane.
The metric $d_T$ on $T_4$ is obtained by identifying each quadrant with the principal quadrant in $\mathbb{R}^2$, thus inducing the Euclidean metric on it, and by defining the length of any rectifiable curve in $T_4$ to be the sum of the lengths of the segments into which it is broken by the axes of the quadrants. Then, analogously to the dichotomy in $Q_5$ we have, on the one hand, the union of the dark shaded areas in Figure 3, where the geodesic in $T_4$ to $(x_i, x_j)$ in the (closed) $(i, j)$-quadrant either is a straight line segment or becomes one when the relevant quadrant is 'folded down' and, on the other hand, the union of the light shaded areas, the 'umbral' set $U_{(x_i,x_j)}$, of points from which the geodesic in $T_4$ to $(x_i, x_j)$ passes through the cone point. Note that $U_{(0,0)} = \emptyset$ and that $U_{(s x_i, s x_j)} = U_{(x_i,x_j)}$ for $s > 0$. Note also that, for each non-cone point $(x_i, x_j)$, $U_{(x_i,x_j)}$ accounts for 2/5 of the total area of $T_4$.
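The 2/5 figure can be checked numerically on the link. The Python sketch below rests on assumptions that are not spelled out in the text: the link is modelled as the Petersen graph (here the Kneser graph on 2-element subsets of a 5-element set) with every edge of length $\pi/2$, a geodesic in $T_4$ is taken to pass through the cone point exactly when the two directions are at distance at least $\pi$ in the link, and area fractions in $T_4$ are identified with length fractions in the link. The function names are hypothetical.

```python
import math
from itertools import combinations

L = math.pi / 2  # assumed length of each link edge (one quadrant)
V = [frozenset(c) for c in combinations(range(5), 2)]
E = [(a, b) for a, b in combinations(V, 2) if not a & b]


def hops(a, b):
    """Graph distance between vertices; the Petersen graph has diameter 2."""
    if a == b:
        return 0
    return 1 if not a & b else 2


def link_distance(edge0, s, edge1, t):
    """Distance in the link between the point at arc length s along edge0
    and the point at arc length t along edge1 (measured from first vertices)."""
    if edge0 == edge1:
        return abs(s - t)
    ends0 = ((edge0[0], s), (edge0[1], L - s))
    ends1 = ((edge1[0], t), (edge1[1], L - t))
    return min(ds + L * hops(a, b) + dt for a, ds in ends0 for b, dt in ends1)


def umbral_fraction(s, samples=400):
    """Fraction of the link at distance >= pi from the point at arc length s on E[0]."""
    count = 0
    for edge1 in E:
        for k in range(samples):
            t = (k + 0.5) * L / samples   # midpoint rule along each edge
            if link_distance(E[0], s, edge1, t) >= math.pi:
                count += 1
    return count / (samples * len(E))


for s in (0.3, 0.8, 1.2):
    print(s, umbral_fraction(s))
```

Under these assumptions the printed fractions should all be close to 0.4, whatever the position $s$ of the reference direction within its quadrant.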
Note that ψ ij is continuous, so measurable, and that ψ ij depends on the chosen (i, j)quadrant, but is independent of points in that chosen quadrant.In terms of the mapping where t 1 is a point in the (i, j)-quadrant and t 2 is arbitrary.However, if t 1 is not in the (i, j)-quadrant, this relation does not necessarily hold.
Figure 3: A decomposition of T 4 with respect to the (i, j)-quadrant: the fifteen shaded quadrants that form T 4 ; geodesics from (x i , x j ) to the light shaded areas pass through the cone point.
We now consider central limit theorems for Fréchet means on the tree space $T_4$. Hence, for the remainder of the paper, we shall assume that $\mu$ is a probability measure on $T_4$ such that its Fréchet function is finite and that $\{\xi_l\}$ is a sequence of iid random variables in $T_4$ with probability measure $\mu$.
In this section, we consider the case that the Fréchet mean ξ of µ lies in a top stratum.For this, by the symmetry of T 4 , we may without loss of generality assume that ξ lies in the interior of the (i, j)-quadrant with coordinates (x i , xj ), so that both xi and xj are positive.Then, since the squared distance from a fixed point in T 4 is differentiable at (x i , xj ), it follows from a similar argument to that used in the previous section that (x i , xj ) is characterised by the analogous condition to that given by (2.1) in Q 5 .Note that, similarly to the case for Q 5 , although 1  2 grad is a surjective map from T 4 to R 2 , it is not injective and we may regard it as a generalised log map for T 4 at (x i , xj ).Hence, by defining we have that (x i , xj ), where both xi and xj are positive, is the Fréchet mean of ξ 1 in T 4 if and only if This, as for Lemma 2.1 in the case of Q 5 , gives the relationship between the Fréchet mean of ξ 1 in T 4 , when that mean lies in a top stratum, and the Euclidean means of the random variables Φ • (ξ 1 ).The expression for d T can be obtained from the expression (2.3) for d Q using the map ψ ij defined by (3.1) and the relationship between them given by (3.2).In particular, for (x i , x j ) in the (i, j)-quadrant of T 4 , we have ψ ij (x i , x j ) = (x i , x j , 0) and, with a certain abuse of notation, we may identify ψ ij (x i , x j ) with (x i , x j ) when (x i , x j ) is in the (i, j)-quadrant of T 4 .Then, a direct computation shows that Φ (xi,xj ) defined by (3.3) is the same as Ψ (xi,xj ) composed with ψ ij : where Ψ is given by (2.2).When ξ lies in a top stratum of T 4 , this relationship between Φ and Ψ, together with Proposition 2.2, gives the central limit theorem for Fréchet means as follows.Since the sample Fréchet means ξn of {ξ l } converge to ξ a.s., ξn will lie in the interior of the (i, j)-quadrant when n is sufficiently large.Hence, in the following without loss of generality, we assume that that is the 
case for all n, so that the sample means ξn have coordinates ξn = (x n i , xn j ) with both xn i and xn j positive.
Theorem 3.1.Let µ be a probability measure on T 4 with finite Fréchet function and with Fréchet mean ξ = (x i , xj ) lying in the interior of the (i, j)-quadrant.Also, let {ξ l } be a sequence of iid random variables in T 4 with probability measure µ and ξn be the sample Fréchet mean of ξ 1 , where V is the covariance matrix of the random variable Φ (xi,xj ) (ξ 1 ), and M (u,v) is given by (2.6).
Proof.By (3.4), the sample Fréchet means satisfy However, the point ψ ij (x r , x s ) in the interior of the subset of Q 5 determined by w = 0 is the Fréchet mean in Q 5 of ψ ij (ξ 1 ) if and only if Comparing the second equality with (2.8) and noting that U Q (xi,xj ) = ψ ij (U (xi,xj ) ) and that ξ 1 = ψ ij (ξ 1 ) , the required result then follows from Proposition 2.2.
Recalling that $U_{(\hat x_i,\hat x_j)}$ accounts for 2/5 of the area of $T_4$, and since $\Psi_{(\hat x_i,\hat x_j)}$ collapses $U^Q_{(\hat x_i,\hat x_j)}$ to the line $\hat x_i v = \hat x_j u$, isometrically on each relevant ray through the origin, it follows that the distribution of $\Phi_{(\hat x_i,\hat x_j)}(\xi_1)$ has positive mass on the half line $\hat x_i v = \hat x_j u$ with $\operatorname{sign}(u) = -\operatorname{sign}(\hat x_i)$, as long as $\mu(U_{(\hat x_i,\hat x_j)}) > 0$. In this case, the distribution of $\Phi_{(\hat x_i,\hat x_j)}(\xi_1)$ is singular. Note also the role played by $U_{(\hat x_i,\hat x_j)}$ in the expression for the matrix $A$ defined by (3.6).
Although Theorem 3.1 concerns the case that Fréchet means lie in the manifold part of $T_4$ and, as noted earlier, $\Phi_{(\hat x_i,\hat x_j)}(\cdot) - (\hat x_i, \hat x_j)$ plays a similar role to that of the log maps for Riemannian manifolds, neither the result nor the proof of Theorem 3.1 is a special case of [16]. This is because [16] deals with complete and simply connected Riemannian manifolds, whereas $\Phi_{(\hat x_i,\hat x_j)}$ is generally not a $C^2$-injective map on the support of $\mu$ except in very special circumstances. In the case that $\Phi_{(\hat x_i,\hat x_j)}$ is an injective map on the support of $\mu$, the support of $\mu$ would be so restricted that one would be able to deduce the result directly from the classical central limit theorem for random variables in Euclidean space. The result of Theorem 3.1 also differs from that of [3] since, in addition to being neither $C^2$ nor injective, the map $\Phi_{(x_i,x_j)}$ also depends on the point $(x_i, x_j)$.

4 Fréchet means in a co-dimension one stratum of $T_4$
We now turn to the case where the Fréchet mean $\hat\xi$ of the probability measure $\mu$ on $T_4$ lies in a co-dimension one stratum. Without loss of generality, we assume that it lies on the open $i$-semi-axis, the co-dimension one stratum corresponding to the $i$-vertex in the Petersen graph. Then, in terms of the coordinate system on $T_4$ that we adopt, there is more than one way to represent $\hat\xi$: either as $(\hat x_i, 0)$ in the boundary of the $(i, j)$-quadrant or as $(0, \hat x_i)$ in the boundary of the $(i_r, i)$-quadrant for $r = 1, 2$, where $\hat x_i > 0$. To indicate explicitly the quadrant in which we are considering $\hat\xi$ to lie, we shall write $(\hat x_i, 0)$ as $(\hat x_i, 0_j)$ when it is to be regarded as a point in the $(i, j)$-quadrant and, similarly, write $(0, \hat x_i)$ as $(0_{i_r}, \hat x_i)$ when it is to be regarded as a point in the $(i_r, i)$-quadrant.
Note that, when $\hat\xi = (\hat x_i, \hat x_j)$ lies on the open $i$-semi-axis, we can regard the union of the three half planes that are bounded by the full $i$-axis and contain, respectively, the $(i, j)$-, $(i_1, i)$- and $(i_2, i)$-quadrants in which $\hat\xi$ lies as 'the tangent space' of $T_4$ at $\hat\xi$. However, since $\hat\xi$ no longer has a neighbourhood that is a manifold, the criterion equivalent to (2.1) for a point to be the Fréchet mean of $\mu$ cannot be applied. Instead, it may be characterised by requiring the non-negativity of the directional derivatives, along the $j$-, $i_1$- and $i_2$-semi-axes, of the Fréchet function for $\mu$, as given by (1.1), together with its derivative along the $i$-semi-axis being zero, at $(\hat x_i, 0_j)$. By continuity, the directional derivative with respect to the first variable of $\frac{1}{2} d_T(t, (x_r, x_s))^2$ at $t = (\hat x_i, 0_j)$ along the $j$ direction has the same expression as $\Psi_{\psi_{ij}(\hat x_i, 0_j)} \circ \psi_{ij} - (\hat x_i, 0_j)$. Since $(i, j)$ is arbitrary, there are similar expressions for the directional derivatives along the $i_r$, $r = 1, 2$, directions. Thus, using the relationship (3.5) to extend the definition of $\Phi_{(\hat x_i,\hat x_j)}$ to any point $(\hat x_i, \hat x_j)$ lying in the closed $(i, j)$-quadrant, the requirement that these three directional derivatives be non-negative may be re-expressed as
In general, if $(\hat x_i, 0_j)$ with $\hat x_i > 0$ is the Fréchet mean of $\mu$, $(\hat x_i, 0, 0)$ is not necessarily the Fréchet mean of $\mu \circ \psi_{ij}^{-1}$, since away from its cone point, $Q_5$ is a Riemannian manifold so that the criterion (2.1) holds there.
The conditions (4.2) also have the following consequence for the behaviour of the sample Fréchet mean ξn of ξ 1 , ⋯ , ξ n , analogous to the result of Theorem 4.3(1) in [13] for open book decompositions.
Lemma 4.3. Let ξ = (x i , 0 j ) on the open i-semi-axis be the Fréchet mean of µ and assume that the first of the inequalities (4.2) is strict. If {ξ l } is a sequence of iid random variables in T 4 with probability measure µ then, for all sufficiently large n, the sample Fréchet mean ξn cannot lie in the interior of the (i, j)-quadrant.
Proof.
The assumption that the first of the inequalities in (4.2) is strict implies that the second coordinate of E[Ψ ψij (xi,0j ) (ψ ij (ξ 1 ))] is negative. Then, by the law of large numbers, the second coordinate of the sample average $\frac{1}{n}\sum_{l=1}^{n}$ Ψ ψij (xi,0j ) (ψ ij (ξ l )) is also negative when n is sufficiently large. On the other hand, since ξn converges to ξ = (x i , 0 j ), the continuity of ψ ij implies that ψ ij ( ξn ) will be close to (x i , 0, 0) for large n. In particular, for large n, the first coordinate of ψ ij ( ξn ) is positive and the third zero. Thus, it follows from (2.7) that, for large n and for 1 ≤ l ≤ n,
This implies that the second coordinate of
(4.4)
Suppose, if possible, that ξn lies in the interior of the (i, j)-quadrant with coordinates (x n i , xn j ), so that xn j > 0. By Corollary 4.2(a) it would follow that ( ξn , 0) is the sample Fréchet mean of ψ ij (ξ 1 ), ⋯ , ψ ij (ξ n ). Then, the left hand side of (4.4) would be equal to xn j , which is positive. However, the right hand side of (4.4) is negative, by the negativity of its first term on account of the given assumption. This contradiction shows that, for all sufficiently large n, ξn cannot lie in the interior of the (i, j)-quadrant.
Analogously, if the second, respectively third, inequality in (4.2) is strict then, for all sufficiently large n, ξn cannot lie in the interior of the (i 1 , i)-quadrant, respectively the (i 2 , i)-quadrant. Similar generalisations will hold for the following theorem on the form of the central limit theorem when ξ lies in a co-dimension one stratum, where Φ is the extension to the closed quadrant that occurs in (4.1).
Theorem 4.4. Let µ be a probability measure on T 4 with finite Fréchet function and with Fréchet mean ξ = (x i , 0) lying on the open i-semi-axis. Also, let {ξ l } be a sequence of iid random variables in T 4 with probability measure µ and ξn be the sample Fréchet mean of ξ 1 , ⋯ , ξ n .
(a) If all three inequalities in (4.2) are strict then, for all sufficiently large n, ξn will lie on the i-semi-axis and the sequence √ n{x n i − xi } of the first coordinates of √ n{ ξn − ξ} converges in distribution to N (0, σ 2 ) as n → ∞, where σ 2 is the variance of the first coordinate of the Euclidean random vector Φ (xi,0j ) (ξ 1 ).
(b) If the first inequality in (4.2) is an equality and the other two are strict then, for all sufficiently large n, ξn will lie in the (i, j)-quadrant and
where (η 1 , η 2 ) ∼ N (0, A V A), V is the covariance matrix of Φ (xi,0j ) (ξ 1 ) and A is as in (3.6) with xj = 0, and where '≡' is understood to hold for all sufficiently large n.
(c) If the first two inequalities in (4.2) are equalities and the third is strict then, for all sufficiently large n, ξn will lie either in the (i, j)-quadrant or in the (i 1 , i)-quadrant and the limiting distribution of √ n{ ξn − ξ}, as n → ∞, takes the same form as that given in Theorem 3.1 with xj = 0, where the coordinates of ξn are taken as (x n i , xn j ), respectively (x n i , −x n i1 ), if ξn is in the (i, j)-quadrant, respectively the (i 1 , i)-quadrant.
(d) If all the inequalities in (4.2) are actually equalities, then we have the same result as in (a).
Proof. (a) By Lemma 4.3, when n is sufficiently large, ξn must lie on the i-semi-axis so that it has coordinates (x n i , 0 j ) (equivalently (0 ir , xn i )). Then a modification of the argument of Section 2 to restrict it to the first coordinates of {ψ ij (ξ l )} gives the required limiting distribution of √ n{x n i − xi }.
(b) By Corollary 4.2(c), (x i , 0, 0) is the Fréchet mean of ψ ij (ξ 1 ). On the other hand, we deduce from the assumed strict inequalities and Lemma 4.3 that, when n is sufficiently large, ξn can only lie in the (closed) (i, j)-quadrant, so that it has coordinates ξn = (x n i , xn j ) where we may assume that xn i > 0.
Hence, Proposition 2.2, the central limit theorem for the sample Fréchet means of {ψ ij (ξ l )}, gives the central limit theorem for the sample Fréchet means of {ξ l }.
(c) In this case, by Corollary 4.2(c), ( ξ, 0) is the Fréchet mean both of ψ ij (ξ 1 ) and of ψ i1,i (ξ 1 ) in Q 5 . So that
Moreover, the integral I i2 becomes zero and so, since the integrand is non-negative, µ(C) = 0 where C, the domain of integration of I i2 , is the union of the (i 2 , i)-, (k 21 , i 2 )- and (k 22 , i 2 )-quadrants with the i-, k 21 - and k 22 -semi-axes removed. It is now more convenient to represent the union of the (i, j)- and (i 1 , i)-quadrants by coordinates in the (x, y)-half-plane with x ≥ 0. For this, we map:
where R is the rotation matrix $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. Similarly we define maps Φ(x,y) to accord with , the map Φ is indeed a.s. well defined for points (x i , 0). Under this new coordinate system we have, in particular, that ξ = (x i , 0 j ) = (x i , 0 i1 ) and that
By Lemma 4.3, the given assumption also implies that, for sufficiently large n, ξn will a.s. lie either in the (i, j)-quadrant or in the (i 1 , i)-quadrant. If ξn lies in the interior of the (i, j)-quadrant with coordinates ξn = (x n i , xn j ), then xn j > 0 and
and, if ξn lies in the interior of the (i 1 , i)-quadrant with (original) coordinates ξn = (x n i1 , xn i ), then
If ξn lies on the open i-semi-axis with coordinates (x n i , 0) then since, locally there, the support of µ is diffeomorphic with R 2 , we also have
Recalling that, under the new coordinate system defined by (4.5), we have by (4.6), (4.7), (4.8) and (4.9) that, in terms of the new coordinates,
Hence, by a similar argument to that of the proof of Theorem 3.1, we see that the central limit theorem now takes the same form as in that theorem with xj = 0.
(d) Noting that all integrands in (4.2) are non-negative, the three equalities will together imply that µ must be concentrated on the union of the i-semi-axis and U (xi,0j ) . Then, µ must have positive mass on the i-semi-axis. Otherwise, it would contradict (4.3), as its left hand side is positive by the assumption and its right hand side would become negative. This results in Φ (xi,0j ) (ξ 1 ) being a one-dimensional random variable on R with mean (x i , 0 j ). Hence, the measure induced on R 2 from µ by Φ (xi,xj ) for (x i , x j ) in the (i, j)-quadrant has support contained in the half plane with non-positive second coordinate. Similarly, for r = 1, 2, the measure induced on R 2 from µ by Φ (xi r ,xi) for (x ir , x i ) in the (i r , i)-quadrant has support contained in the half plane with non-positive first coordinate. This constraint on µ implies that ξn lies on the i-semi-axis for all sufficiently large n. Otherwise, if ξn = (x n i , xn j ) lies in the interior of the (i, j)-quadrant, say, then, on the one hand, xn j > 0 and, on the other hand, on account of the features of µ, we must have
Thus, the argument for (a) implies that, when the inequalities in (4.2) are all equalities, the central limit theorem for the sample Fréchet means takes the same form as that when the three inequalities are all strict.
5 Fréchet means at the cone point of T 4
The cone point o being the Fréchet mean of µ is equivalent to the fact that, for any i, j and any non-cone point (x i , x j ) in the (i, j)-quadrant, we have
which is equivalent to all possible directional derivatives of the Fréchet function for µ being non-negative at the cone point. Recalling that Ψ (u,v) = Ψ s(u,v) , s > 0, for Ψ defined by (2.2), it then follows from the relationship between Φ and Ψ that Φ (xi,xj ) = Φ s(xi,xj ) for any s > 0, Φ here being the extension to the closed (i, j)-quadrant defined at the beginning of the previous section. Using this invariance, it is more transparent to write Φ (xi,xj ) = Φ θij in studying the limiting behaviour of the sample Fréchet means ξn when ξ is at the cone point, where θ ij ∈ [0, π/2] is determined by tan θ ij = x j /x i . With this new notation, the above condition for the cone point o to be the Fréchet mean of µ is equivalent to the fact that, for any (i, j)-quadrant and any θ ij ∈ [0, π/2],
0. (5.1)
For θ ij ∈ (0, π/2), the condition (5.1) implies that, if both of its coordinates are non-negative, then ∫ T4 Φ θij (ξ) dµ(ξ) must be at the origin. On the other hand, if at least one of the coordinates of ∫ T4 Φ θij (ξ) dµ(ξ) is negative, it follows from (3.4) that, when n is sufficiently large, the sample Fréchet mean ξn of ξ 1 , ⋯ , ξ n will not be on the half line in the (i, j)-quadrant determined by θ ij . Note that, when θ ij varies, the distribution of Φ θij (ξ) generally varies too.
Fix an arbitrary (i, j)-quadrant and let
(5.2)
Lemma 5.1. The restriction of Θ ij to (0, π/2) either is the empty set, or forms an interval, where the latter includes the case of a single point.
By the classical central limit theorem there is an n 0 such that, for all n > n 0 and for all θ For any given (5.4) and (5.5) hold then, for n > n 0 , we have, by applying (2.7) and using the relationship between Ψ and Φ with (x (5.6) However, if there were s > 0 and θ ij ∈ (0, π/2) such that then by (5.5) we would have ).Thus, it follows from (3.4) that, for n > n 0 , ξn does not lie on the open half line determined by any θ ij in (0, π/2).
For θ ij = 0 (respectively π/2), the condition (5.1) implies that the first (respectively the second) coordinate of ∫ T4 Φ θij (ξ) dµ(ξ) is non-positive and, since Θ ij = ∅, it must be negative. The result of Lemma 4.1 then implies that, when n is sufficiently large, ξn will not lie on the boundary of the (i, j)-quadrant determined by x j = 0 (respectively ij ], then the above argument shows that, for any ε > 0, there is an n 0 such that, when n > n 0 , if ξn is in the (i, j)-quadrant, it must lie in the cone spanned by the cone point and angles [θ The arbitrariness of ε means that, given it lies in the (i, j)-quadrant, the probability that ξn lies in the cone C ij tends to one as n → ∞.
On the other hand, the proof of Lemma 5.1 shows that, for all θ ij ∈ Θ ij , the random variables Φ θij (ξ 1 ) have the same distribution. Thus, as before, given that ξn is in C
Moreover, the definition of Θ ij implies that the Euclidean mean of Φ θij (ξ 1 ) is at the origin for all θ ij ∈ Θ ij ∩ (0, π/2). Hence, the classical central limit theorem gives that, for any Borel set B ⊂ R 2 ,
The required result follows by noting that
Note that the above proof shows that, if Θ ij ∩ (0, π/2) contains only one single angle θ ij , then Φ θij (ξ) must be a one-dimensional random variable and the limiting distribution must be one-dimensional Gaussian. The following result is a direct consequence of that of Theorem 5.2(a).
Corollary 5.3. Assume that the Fréchet mean ξ of µ is at the cone point. If Θ ij = ∅ for all (i, j)-quadrants of T 4 , then the sample Fréchet mean ξn will be at the cone point for all sufficiently large n.
We note finally that the proof of the central limit theorem when the Fréchet mean of a probability measure on T 4 is at the cone point can easily be simplified to obtain similar results for the central limit theorem when the Fréchet mean of a probability measure on Q 5 is at the cone point of Q 5 .

Appendix: The tree space
A tree is a contractible graph, that is, a connected graph with no circuits. An n-tree has n + 1 vertices of degree 1, one of which is distinguished as its root, the others being termed leaves. We are interested in labelled trees for which the names, generally a, b, c, ⋯ , of the leaves matter. However, since all our trees are labelled, we shall generally omit that adjective. The remaining internal vertices all have degree at least 3. If both vertices of an edge are internal, then that edge is also called internal. A tree in which all internal vertices have degree 3 is called a binary tree and such an n-tree has n − 1 internal vertices and n − 2 internal edges, giving a total of 2n vertices and 2n − 1 edges. Note that it does not matter how the tree is oriented in the plane.
The tree space, T n , is the moduli space of labelled n-trees, the moduli or parameters being the lengths of the internal edges.A labelled n-tree with unspecified edge lengths determines a topological type, modulo its leaves and root, that we shall refer to as an n-tree-type, or just n-type.Then T n is a stratified space with a stratum for each n-type, such that an n-type with k internal edges determines a stratum with k positive parameters ranging over the points of an open k-dimensional orthant.The top dimensional strata correspond to the binary tree-types and, if β n is the number of binary labelled n-tree-types, then any edge, internal or not, of a binary labelled (n − 1)-tree may be replaced by three edges, an internal vertex and a new leaf n, to give a well-defined binary labelled n-tree:
The procedure being reversible, we see that $\beta_n = (2n-3)\,\beta_{n-1} = (2n-3)!!$, since a binary labelled (n − 1)-tree has 2n − 3 edges. If we also specify the lengths of the non-internal edges, those with leaves or the root attached, we obtain the 'full tree space', which is the product T n × (R + ) n . In this paper, we shall only consider T n since a central limit theorem on the full tree space may be derived from the corresponding one on T n . Such an n-tree-type can be represented as a particular product of the leaves that, assuming commutativity, is determined by an appropriate sequence of associations. Thus,
The boundary relation in the stratification of T n corresponds to a parameter becoming zero. That is equivalent to removing a bracket from the associative pattern and corresponds to the two vertices of an edge in the corresponding n-type coalescing to form a single vertex. For the top stratum, that new vertex will have degree 4. Such a vertex may be resolved in three ways to re-establish a binary tree:
where A, B, C are subtrees, possibly just leaves, and X is a subtree containing the root and any leaves that are not involved in A, B or C. Thus each such stratum of co-dimension one lies on the boundary of three top dimensional strata. On the other hand each top dimensional stratum has n − 2, the number of internal edges of the binary n-type, such co-dimension one strata forming its boundary. Thus there are $\frac{n-2}{3}(2n-3)!!$ co-dimension one strata. At the other extreme, each T n has a single zero dimensional stratum, a vertex representing the tree with no internal edges and a single internal vertex of degree n + 1. The one dimensional strata correspond to just one bracket enclosing two or more, but not all, leaves so that there are $2^n - n - 2$ of them. Note that two such strata may belong to the same tree-type if, and only if, the corresponding brackets are not linked: either one includes the other or their contents are disjoint. We shall refer to the top dimensional strata as cells, specifically (n − 2)-cells, and the co-dimension one strata as their faces. We shall often refer to the one dimensional strata as semi-axes, since that is indeed what they represent in T n . Note that T 3 is special in that the 1-cells are the three one-dimensional strata and the only face is the unique zero-dimensional stratum.
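The stratum counts described above can be checked numerically. The following is a minimal Python sketch, not part of the original paper; the function names are hypothetical, and the formulas are the ones stated in the text: (2n − 3)!! top dimensional cells, (n − 2)(2n − 3)!!/3 co-dimension one faces, and 2^n − n − 2 semi-axes.

```python
def double_factorial(m: int) -> int:
    """Return m!! = m * (m - 2) * (m - 4) * ... down to 1 or 2 (1 if m <= 1)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def num_cells(n: int) -> int:
    """Number of binary labelled n-tree-types, i.e. top dimensional strata of T_n."""
    return double_factorial(2 * n - 3)


def num_faces(n: int) -> int:
    """Number of co-dimension one strata: each cell has n - 2 faces and each
    face is shared by three cells."""
    return (n - 2) * num_cells(n) // 3


def num_semi_axes(n: int) -> int:
    """Number of one dimensional strata: brackets enclosing two or more,
    but not all, of the n leaves."""
    return 2 ** n - n - 2


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, num_cells(n), num_faces(n), num_semi_axes(n))
```

For n = 4 the faces and the semi-axes coincide, since the cells are two-dimensional, and both formulas give the same count.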
An alternative notation for a tree-type arises from the observation that an internal edge partitions the full set of leaves, including the root, into two subsets and a tree-type may be specified by the set of such partitions, or splits, that correspond to its internal edges. This notation is more apposite when one is not distinguishing the root. The correlation is that the bracket that determines a semi-axis in the former notation contains the members of the subset that does not contain the root in the split notation.
Following [4], we introduce a metric d T on T n by identifying each (n − 2)-cell with the principal orthant in R n−2 with the Euclidean metric, and defining the length of any rectifiable curve in T n to be the sum of the lengths of the segments into which it is broken by the faces of the cells. Since each orthant is a cone with vertex at the origin, T n is itself a cone, whose vertex, to avoid confusion with vertices in trees, we shall refer to as the cone point. Given trees t 1 , t 2 , there is always a path from t 1 to t 2 comprising the ray from t 1 to the cone point followed by the ray from the cone point to t 2 . In a good number of cases, 40% in T 4 , this will be the geodesic from t 1 to t 2 . Obviously, however, this is not the geodesic when t 1 and t 2 lie in the same cell. Moreover, if they lie in (n − 2)-cells σ 1 and σ 2 that share a common face f 12 , then the union of these cells may be isometrically embedded in R n−2 in the obvious way and the geodesic in T n will correspond to the straight line in R n−2 . Similar considerations give rise to further examples in T n .
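The three situations just described can be illustrated for T 4 , where the cells are plane quadrants. The following Python sketch is an illustration under stated assumptions rather than the authors' construction: points are given as coordinate pairs, the function names are hypothetical, and for two quadrants sharing a semi-axis the coordinate along the shared semi-axis is assumed to be listed first in both pairs, so that the second quadrant can be reflected into the lower half of a half plane.

```python
import math


def cone_path_length(t1, t2):
    """Length of the path from t1 to the cone point followed by the ray to t2."""
    return math.hypot(t1[0], t1[1]) + math.hypot(t2[0], t2[1])


def same_cell_distance(t1, t2):
    """Geodesic distance when both points lie in the same quadrant."""
    return math.hypot(t1[0] - t2[0], t1[1] - t2[1])


def adjacent_cell_distance(t1, t2):
    """Geodesic distance when the quadrants share a semi-axis.

    Assumed convention: t1 = (x_i, x_j) and t2 = (x_i', x_k), with the shared
    i-coordinate first. The union of the two quadrants is embedded in a half
    plane by sending t2 to (x_i', -x_k); the geodesic is the straight segment.
    """
    return math.hypot(t1[0] - t2[0], t1[1] + t2[1])


if __name__ == "__main__":
    t1, t2 = (1.0, 1.0), (1.0, 1.0)
    # In adjacent quadrants the straight segment is shorter than the cone path.
    print(adjacent_cell_distance(t1, t2), cone_path_length(t1, t2))
```

The cone path is always available, so its length is an upper bound for the distance in either of the other two cases.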
When we restrict attention to T 4 , just one more feature occurs as follows. In T 4 the 2-cells are now plane quadrants and their faces are the two semi-axes. Given a chain of three quadrants with consecutive quadrants sharing a semi-axis, their union may be embedded in R 2 by omitting, say, its +− quadrant. If the geodesic starts in the interior of the first quadrant, then it can be realised as a straight line segment to some points in the third quadrant, but must pass through the cone point for the rest. Any such sequence of three quadrants lies in a unique cycle of five quadrants in T 4 . Thus, to describe the geodesics in T 4 in general, it will be convenient to embed such 5-cycles in R 3 as the space Q 5 . We now describe these 5-cycles, and how they all fit together in T 4 . A quadrant in T 4 will correspond to a 4-type with bracket notation of the form ((ab)c)d or (ab)(cd). The quadrant ((ab)c)d has the two semi-axes (ab)cd and (abc)d. This first semi-axis is shared by the quadrants ((ab)d)c and (ab)(cd), while the second is shared by (a(bc))d and ((ac)b)d. Choosing the sequence ((ab)d)c, ((ab)c)d, (a(bc))d, the 'free' semi-axes at each end, which are not part of the quadrant ((ab)c)d, are (abd)c and (bc)ad. The only edge with which these are both compatible is (ad)bc, giving the other two quadrants in the 5-cycle. The same is true for any sequence of three contiguous quadrants in T 4 : there is a unique choice of two further quadrants to give a cycle of five quadrants in which cyclically consecutive members share a semi-axis. However, there are two independent choices for each of the quadrants cobounding, i.e. sharing a semi-axis with, the central of the three initial quadrants. This implies that each quadrant will lie in four such 5-cycles. It is convenient to visualise and describe these details of the structure of T 4 via the Petersen graph which is the link T4 of its cone point, as illustrated in Figure 4. The labelling given there is not unique, but all possible labellings are equivalent.
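The combinatorics of the link can be verified by brute force. The following Python sketch is an illustration, not taken from the paper. It uses the split notation: each semi-axis of T 4 is encoded by the two-element side of a split of the five labels (root plus four leaves), and two semi-axes bound a common quadrant exactly when those two-element sets are disjoint. The labelling 0, ..., 4 and all names are hypothetical.

```python
from itertools import combinations

# Labels: 0 for the root and 1..4 for the leaves (an arbitrary, assumed labelling).
LABELS = range(5)

# A semi-axis is a split of the five labels into a 2-set and a 3-set,
# recorded here by its 2-set.
semi_axes = [frozenset(c) for c in combinations(LABELS, 2)]

# Two semi-axes are compatible (bound a common quadrant) iff their 2-sets are disjoint.
adjacent = {u: {v for v in semi_axes if not (u & v)} for u in semi_axes}
quadrants = [frozenset((u, v)) for u, v in combinations(semi_axes, 2) if not (u & v)]


def five_cycles():
    """Return all 5-cycles of the link, each as a frozenset of its five quadrants."""
    cycles = set()

    def extend(path):
        if len(path) == 5:
            if path[0] in adjacent[path[-1]]:
                cycles.add(frozenset(
                    frozenset((path[k], path[(k + 1) % 5])) for k in range(5)
                ))
            return
        for v in adjacent[path[-1]]:
            if v not in path:
                extend(path + [v])

    for start in semi_axes:
        extend([start])
    return cycles


if __name__ == "__main__":
    cycles = five_cycles()
    print(len(semi_axes), len(quadrants), len(cycles))
    print({sum(q in c for c in cycles) for q in quadrants})
```

Running the sketch counts the semi-axes, the quadrants and the 5-cycles, and reports how many 5-cycles contain each quadrant, which can be compared with the statement that each quadrant lies in four such 5-cycles.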

(4.2)
Hence, we have the following result to characterise a point in the open i-semi-axis of T 4 as the Fréchet mean of µ.

From the proof of Theorem 3.1 and the result of Lemma 4.1, we obtain the following relationship between the Fréchet means of µ in T 4 and of µ ∘ ψ −1 ij in Q 5 . Note that, if ξ is a random variable in T 4 with probability measure µ, then ψ ij (ξ) is a random variable in Q 5 with probability measure µ ∘ ψ −1 ij .
Corollary 4.2. Assume that the Fréchet mean of µ lies in the (i, j)-quadrant.
(a) The Fréchet mean of µ is (x i , xj ) with xi > 0 and xj > 0 if and only if

Figure 4: The structure of T 4 represented on the Petersen graph
follows from (3.4) and (3.5) that ξ 1 is a random variable in T 4 with Fréchet mean (x i , xj ), where both xi and xj are positive, if and only if (x i , xj , 0) is the Fréchet mean of ψ ij (ξ 1 ) in Q 5 . Hence, we can re-express (3.7) as
Then, by Corollary 4.2, xn i is the first coordinate of the sample Fréchet mean of ψ ij (ξ 1 ), ⋯ , ψ ij (ξ n ). However, xn j > 0 if and only if both of the first two coordinates of ψ ij ( ξn ) are positive and, by Corollary 4.2(a), in that case, (x n i , xn j , 0) is the sample Fréchet mean of ψ ij (ξ 1 ), ⋯ , ψ ij (ξ n ). Thus, xn j is zero if and only if the second coordinate of the sample Fréchet mean of