On the power of axial tests of uniformity on spheres

Testing uniformity on the $p$-dimensional unit sphere is arguably the most fundamental problem in directional statistics. In this paper, we consider this problem in the framework of axial data, that is, under the assumption that the $n$ observations at hand are randomly drawn from a distribution that charges antipodal regions equally. More precisely, we focus on axial, rotationally symmetric, alternatives and first address the problem under which the direction $\theta$ of the corresponding symmetry axis is specified. In this setup, we obtain Le Cam optimal tests of uniformity, that are based on the sample covariance matrix (unlike their non-axial analogs, that are based on the sample average). For the more important unspecified-$\theta$ problem, some classical tests are available in the literature, but virtually nothing is known on their non-null behavior. We therefore study the non-null behavior of the celebrated Bingham test and of other tests that exploit the single-spiked nature of the considered alternatives. We perform Monte Carlo exercises to investigate the finite-sample behavior of our tests and to show their agreement with our asymptotic results.


Introduction
Directional statistics are concerned with data taking values on the unit hypersphere $\mathcal{S}^{p-1} := \{x \in \mathbb{R}^p : \|x\|^2 := x'x = 1\}$ of $\mathbb{R}^p$. Classical applications, which most often relate to the circular case ($p = 2$) or the spherical one ($p = 3$), involve wind and animal migration data or belong to fields such as geology, paleomagnetism, or cosmology. We refer, e.g., to Fisher, Lewis and Embleton (1987), Mardia and Jupp (2000), and Ley and Verdebout (2017) for book-length treatments of the topic and for further applications.
Arguably the most fundamental problem in directional statistics is the problem of testing for uniformity, which, for a random sample $X_1, \ldots, X_n$ at hand, consists in testing the null hypothesis that the observations are sampled from the uniform distribution over $\mathcal{S}^{p-1}$. This is a very classical problem in multivariate analysis that can be traced back to Bernoulli (1735). As explained in the review paper García-Portugués and Verdebout (2018), the topic has recently received a lot of attention: to cite only a few contributions, Jupp (2008) proposed data-driven Sobolev tests, Cuesta-Albertos, Cuevas and Fraiman (2009) and García-Portugués, Navarro-Esteban and Cuesta-Albertos (2019) proposed tests based on random projections, Lacour and Pham Ngoc (2014) studied the problem for noisy data, Paindaveine and Verdebout (2016) obtained the high-dimensional limiting behavior of some classical test statistics under the null hypothesis, and García-Portugués, Paindaveine and Verdebout (2019) transformed some uniformity tests into tests of rotational symmetry.
A classical uniformity test dates back to Rayleigh (1919) and rejects the null hypothesis for large values of $\|\bar{X}_n\|$, where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ is the sample mean of the observations. This test detects only alternatives whose mean vectors are non-zero, and is therefore typically used when possible deviations from uniformity are suspected to be asymmetric. In particular, the Rayleigh test will show no power when the common distribution of the $X_i$'s is an antipodally symmetric distribution on the sphere, that is, when this distribution attributes the same probability to antipodal regions. Far from being the exception, such antipodally symmetric distributions are actually those that need to be considered when practitioners are facing axial data, that is, when one does not observe genuine locations on the sphere but rather axes (a typical example of axial data relates to the directions of optical axes in quartz crystals; see, e.g., Mardia and Jupp, 2000). Models and inference for axial data have been considered extensively in the literature: to cite a few contributions, Tyler (1987) and, more recently, Paindaveine, Remy and Verdebout (2018) considered inference on the distribution of the spatial sign of a Gaussian vector, Watson (1965), Bijral, Breitenbach and Grudic (2007) and Sra and Karp (2013) considered inference for Watson distributions (see the next section), Dryden (2005) obtained distributions on high-dimensional spheres, while Anderson and Stephens (1972), Bingham (1974) and Jupp (2001) considered uniformity tests against antipodally symmetric alternatives. Now, while the literature offers both axial and non-axial tests of uniformity, axial procedures unfortunately remain much less well understood than their non-axial counterparts, particularly so when it comes to their non-null behaviors.
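The blindness of the Rayleigh test to antipodal symmetry can be illustrated numerically. The sketch below is only an illustration: the scaling $n p \|\bar{X}_n\|^2$ is the usual normalization of the Rayleigh statistic (an assumption here, since the text only mentions large values of $\|\bar{X}_n\|$), and the sample, the function name `rayleigh_statistic`, and the symmetrization step are hypothetical choices made for the example.

```python
import numpy as np

def rayleigh_statistic(X):
    """Rayleigh-type statistic n * p * ||mean(X)||^2 for an (n, p) array of unit vectors."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    return n * p * float(xbar @ xbar)

rng = np.random.default_rng(0)

# Hypothetical strongly non-uniform sample: points clustered around the first axis.
Z = rng.normal(size=(500, 3)) + np.array([4.0, 0.0, 0.0])
X = Z / np.linalg.norm(Z, axis=1, keepdims=True)

# Axial version of the same data: each axis is represented by both antipodal points,
# so the empirical distribution is exactly antipodally symmetric.
X_axial = np.vstack([X, -X])

rayleigh_directional = rayleigh_statistic(X)     # large: the mean vector is far from zero
rayleigh_axial = rayleigh_statistic(X_axial)     # essentially zero: the mean vector vanishes
print(rayleigh_directional, rayleigh_axial)
```

Although both samples are concentrated around the same axis, the symmetrized sample has a vanishing mean, which is why a statistic built on the sample covariance matrix is needed in the axial setting.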
Strong results have been obtained for non-axial tests of uniformity regarding their asymptotic power under suitable local alternatives to uniformity and even regarding their optimality (we refer to Cutting, Paindaveine and Verdebout, 2017 and to the references therein), but virtually nothing is known in that direction for axial tests of uniformity. This provides the main motivation for the present work, which intends to fill an important gap by studying the non-null behavior of some classical (and less classical) axial tests of uniformity. Quite naturally, we will do so in the semiparametric distributional framework that has been classically considered for non-axial tests of uniformity, namely the framework of rotationally symmetric distributions indexed by a finite-dimensional parameter $(\kappa, \theta) \in \mathbb{R}^+ \times \mathcal{S}^{p-1}$ and an infinite-dimensional parameter $f \in \mathcal{F}$ (a family of functions we define in the next section). Within this semiparametric model, the null hypothesis of uniformity takes the form $\mathcal{H}_0 : \kappa = 0$. In this paper, we first derive the shape of tests that are locally and asymptotically optimal under any $f \in \mathcal{F}$ for the specified-$\theta$ problem. We then focus on the unspecified-$\theta$ problem and discuss its connection with the specified-$\theta$ problem. We derive the limiting behavior, under sequences of contiguous alternatives (for any $f \in \mathcal{F}$), of the tests provided in Anderson and Stephens (1972) and Bingham (1974). Doing so, we obtain in particular the limiting behavior, under local alternatives, of the extreme eigenvalues of the spatial sign covariance matrix, which is a result of independent interest; see Dürre, Tyler and Vogel (2016) and the references therein for a recent study of these eigenvalues.
The outline and contribution of the paper are as follows. In Section 2, we introduce the class of (rotationally symmetric) alternatives to uniformity that will be considered in this work, which deviate from uniformity along a direction $\theta$ and have a severity controlled by a concentration parameter $\kappa$. In this framework, we identify the sequences $(\kappa_n)$ that make the corresponding sequences of alternatives contiguous to the null hypothesis of uniformity. In Section 3, we tackle the problem of testing uniformity under specified $\theta$ and show that the resulting model is locally asymptotically normal (LAN). We define the resulting optimal tests of uniformity and determine their asymptotic powers under contiguous alternatives. In Section 4, we turn to the unspecified-$\theta$ problem, we show that our LAN result naturally leads to the Bingham (1974) test of uniformity, and we study the limiting behavior of this test under contiguous alternatives. In Section 5, we turn our attention to tests that take into account the "single-spiked" structure of the considered alternatives. We characterize the asymptotic behavior of these tests both under the null hypothesis and under sequences of contiguous alternatives. While all results above are confirmed by suitable numerical exercises in Sections 3-5, we specifically conduct, in Section 6, Monte Carlo simulations in order to compare the finite-sample powers of the various tests. We provide final comments and perspectives for future research in Section 7. Finally, an appendix contains all proofs.

Axial rotationally symmetric distributions
In this section, we set the notation and describe the class of axial distributions we will use as alternatives to uniformity. A celebrated class of axial distributions on the sphere is the one that collects the Watson distributions, which admit a density (throughout, densities over $\mathcal{S}^{p-1}$ are with respect to the surface area measure) of the form (1), where $\theta$ belongs to $\mathcal{S}^{p-1}$, $\kappa$ is a real number, and where the value of the normalizing constant $c_{p,\kappa}$ can be obtained from (3); as usual, $\Gamma$ denotes Euler's Gamma function. Since this density is a symmetric function of $x$, it attributes the same probability to antipodal regions on the sphere, hence is indeed suitable for axial data. The Watson distribution is rotationally symmetric about $\theta$, in the sense that if $X$ has density (1), then $OX$ and $X$ share the same distribution for any $p \times p$ orthogonal matrix $O$ such that $O\theta = \theta$; consequently, $\theta$ will be considered a location parameter. The Watson distributions are the rotationally symmetric (or single-spiked) Bingham (1974) distributions. The parameter $\kappa$ is a concentration parameter: the larger $|\kappa|$, the more the probability mass is concentrated, either symmetrically about the poles $\pm\theta$ for positive values of $\kappa$ (bipolar case) or symmetrically about the hyperspherical equator $\mathcal{S}^{\perp}_{\theta} := \{x \in \mathcal{S}^{p-1} : x'\theta = 0\}$ for negative values of $\kappa$ (girdle case). Of course, the value $\kappa = 0$ corresponds to the uniform distribution over the sphere.
In this paper, we consider a natural semiparametric extension of the class of Watson distributions, namely the class of axial distributions admitting a density of the form (2), where $\theta$ and $\kappa$ are as in (1), $f$ belongs to the class of functions $\mathcal{F} := \{f : \mathbb{R} \to \mathbb{R}^+ : f$ monotone increasing, twice differentiable at $0$, with $f(0) = f'(0) = 1\}$, and where the normalizing constant $c_{p,\kappa,f}$ is given in (3). The parameter $\kappa$ is still a concentration parameter, which shares the same interpretation as for Watson distributions. The restriction to $\mathcal{F}$ above is made for identifiability purposes. If $\kappa \neq 0$, then $f$ and the pair $\{\pm\theta\}$ are identifiable, but $\theta$ itself is not (which is natural for axial distributions). For $\kappa = 0$, the uniform distribution over the sphere is obtained (irrespective of $f$), in which case the location parameter $\theta$ is unidentifiable, even up to a sign. The corresponding normalizing constant is $c_p := \lim_{\kappa \to 0} c_{p,\kappa,f}$. The distribution associated with (2) is rotationally symmetric about $\theta$, and if $X$ is a random vector with this distribution, then $X'\theta$ has density $s \mapsto c_{p,\kappa,f} (1 - s^2)^{(p-3)/2} f(\kappa s^2)\, \mathbb{I}[|s| \leq 1]$, which explains the expression (3). In the present axial case, this density is of course symmetric with respect to zero. The semiparametric class of distributions just introduced will be used in the paper as alternatives to uniformity on the sphere. We will consider the following hypotheses and asymptotic scenarios. For $\theta \in \mathcal{S}^{p-1}$, a sequence $(\kappa_n)$ in $\mathbb{R}_0$ and $f \in \mathcal{F}$, we will denote as $P^{(n)}_{\theta,\kappa_n,f}$ the hypothesis under which $X_{n1}, \ldots, X_{nn}$ form a random sample from the density $x \mapsto c_{p,\kappa_n,f} f(\kappa_n (x'\theta)^2)$ over $\mathcal{S}^{p-1}$. This triangular array framework will allow us to consider local alternatives, associated with suitable sequences $(\kappa_n)$ converging to zero. The null hypothesis of uniformity will be denoted as $P^{(n)}_0$.
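The normalization behind the marginal density of $X'\theta$ can be made explicit. The following is a sketch that uses only the marginal density stated above and the condition $f(0) = 1$; the closed form for $c_p$ relies on the standard Beta integral and is not taken from the displayed equations of the paper.

```latex
\[
  1 = \int_{-1}^{1} c_{p,\kappa,f}\,(1-s^2)^{(p-3)/2} f(\kappa s^2)\,ds
  \quad\Longrightarrow\quad
  c_{p,\kappa,f} = \Big( \int_{-1}^{1} (1-s^2)^{(p-3)/2} f(\kappa s^2)\,ds \Big)^{-1},
\]
\[
  c_p = \lim_{\kappa \to 0} c_{p,\kappa,f}
      = \Big( \int_{-1}^{1} (1-s^2)^{(p-3)/2}\,ds \Big)^{-1}
      = \frac{\Gamma(p/2)}{\sqrt{\pi}\,\Gamma((p-1)/2)} .
\]
```

As a sanity check, this gives $c_2 = 1/\pi$ (the arcsine density on $[-1,1]$) and $c_3 = 1/2$ (the uniform density on $[-1,1]$).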
The sequence of hypotheses $P^{(n)}_{\theta,\kappa_n,f}$ determines a sequence of alternatives to uniformity: the larger $|\kappa_n|$, the more severe the corresponding alternative, whereas the sign of $\kappa_n$ determines the type of alternatives considered, i.e., bipolar (for $\kappa_n > 0$) or girdle-like (for $\kappa_n < 0$). At places, it will be of interest to compare our results with those obtained in the non-axial case, that is, in the case where $X_{n1}, \ldots, X_{nn}$ have a common density proportional to $f(\kappa_n x'\theta)$, still with $f$ monotone increasing (rather than $f(\kappa_n (x'\theta)^2)$).
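For numerical work, Watson samples (the case $f(z) = \exp(z)$) can be generated from the marginal density of $X'\theta$ given above. The sketch below is one possible approach, not the authors' simulation code: it draws $s = X'\theta$ by rejection sampling with the uniform-case marginal as proposal, then completes $X$ with a uniform direction orthogonal to $\theta$. The function name `sample_watson` and its arguments are hypothetical, and the acceptance rate degrades for large positive $\kappa$, so the sketch is mainly suited to moderate concentrations such as local alternatives.

```python
import numpy as np

def sample_watson(n, theta, kappa, rng):
    """Draw n points on the unit sphere from a density proportional to exp(kappa * (x'theta)^2)."""
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    p = theta.size

    # Step 1: rejection sampling for s = x'theta.
    # Proposal: first coordinate of a uniform point on the sphere, whose density is
    # proportional to (1 - s^2)^((p-3)/2); the target adds the factor exp(kappa * s^2),
    # which is bounded by exp(max(kappa, 0)) on [-1, 1].
    s = np.empty(0)
    while s.size < n:
        g = rng.normal(size=(2 * n, p))
        cand = g[:, 0] / np.linalg.norm(g, axis=1)
        log_u = np.log(1.0 - rng.random(size=cand.size))  # log of a uniform on (0, 1]
        keep = log_u < kappa * cand**2 - max(kappa, 0.0)
        s = np.concatenate([s, cand[keep]])
    s = s[:n]

    # Step 2: uniform unit vectors in the orthogonal complement of theta.
    v = rng.normal(size=(n, p))
    v -= np.outer(v @ theta, theta)
    v /= np.linalg.norm(v, axis=1, keepdims=True)

    # Step 3: tangent-normal decomposition x = s * theta + sqrt(1 - s^2) * v.
    return s[:, None] * theta + np.sqrt(1.0 - s**2)[:, None] * v

rng = np.random.default_rng(1)
X_bipolar = sample_watson(2000, [1.0, 0.0, 0.0], 3.0, rng)   # mass near the poles
X_girdle = sample_watson(2000, [1.0, 0.0, 0.0], -3.0, rng)   # mass near the equator
```

Under uniformity $E[(X'\theta)^2] = 1/p$, so the empirical mean of the squared first coordinate should exceed $1/3$ for the bipolar sample and fall below it for the girdle sample.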
Our first result identifies the sequences of alternatives $P^{(n)}_{\theta,\kappa_n,f}$ that are contiguous to the sequence of null hypotheses $P^{(n)}_0$.
Theorem 2.1. Fix $p \in \{2, 3, \ldots\}$, $\theta \in \mathcal{S}^{p-1}$, and $f \in \mathcal{F}$. Let $(\kappa_n)$ be a sequence in $\mathbb{R}_0$ that is $O(1/\sqrt{n})$. Then, the sequence of alternative hypotheses $P^{(n)}_{\theta,\kappa_n,f}$ and the sequence of null hypotheses $P^{(n)}_0$ are mutually contiguous. In other words, if $\kappa_n = O(1/\sqrt{n})$, then no test for $\mathcal{H}_{0n} : \{P^{(n)}_0\}$ against $\mathcal{H}_{1n} : \{P^{(n)}_{\theta,\kappa_n,f}\}$ can be consistent. Actually, it will follow from Theorem 3.1 in the next section that $1/\sqrt{n}$ is the contiguity rate, in the sense that if $\kappa_n = \tau/\sqrt{n}$ (for some non-zero real constant $\tau$), then there exist tests for $\mathcal{H}_{0n} : \{P^{(n)}_0\}$ against $\mathcal{H}_{1n} : \{P^{(n)}_{\theta,\kappa_n,f}\}$ showing non-trivial asymptotic powers (that is, asymptotic powers in $(\alpha, 1)$, where $\alpha$ denotes the nominal level). The contiguity rate in the axial case thus coincides with the one obtained in the non-axial case; see Theorems 2.1 and 3.1 in Cutting, Paindaveine and Verdebout (2017).

Tests of uniformity under specified location
In this section, we consider the problem of testing uniformity over $\mathcal{S}^{p-1}$ against the class of alternatives introduced in the previous section, in a situation where the location $\theta$ is specified. In other words, this corresponds to cases where it is known in which direction the possible deviation from uniformity would materialize. Depending on the exact type of alternatives we want to focus on (bipolar, girdle-type, or both), we will then consider, for a fixed $\theta$, the problem of testing the null hypothesis of uniformity against the alternatives $P^{(n)}_{\theta,\kappa,f}$ with $\kappa > 0$, with $\kappa < 0$, or with $\kappa \neq 0$, respectively. Optimal testing may be based on the following Local Asymptotic Normality (LAN) result.
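Theorem 3.1. Fix $p \in \{2, 3, \ldots\}$, $\theta \in \mathcal{S}^{p-1}$, and $f \in \mathcal{F}$. Let $\kappa_n = \tau_n p/\sqrt{n}$, where the real sequence $(\tau_n)$ is $O(1)$ but not $o(1)$. Then, letting $\Delta^{(n)}_\theta := \frac{p}{\sqrt{n}} \sum_{i=1}^n \{(X_{ni}'\theta)^2 - \frac{1}{p}\}$ and $\Gamma_p := \frac{2(p-1)}{p+2}$, we have that, as $n \to \infty$ under $P^{(n)}_0$, $\log \frac{dP^{(n)}_{\theta,\kappa_n,f}}{dP^{(n)}_0} = \tau_n \Delta^{(n)}_\theta - \frac{\tau_n^2}{2} \Gamma_p + o_P(1)$, where $\Delta^{(n)}_\theta$ is asymptotically normal with mean zero and variance $\Gamma_p$. In other words, the corresponding sequence of experiments is locally asymptotically normal, with central sequence $\Delta^{(n)}_\theta$ and Fisher information $\Gamma_p$.
This result confirms that $1/\sqrt{n}$ is the contiguity rate when testing uniformity against the considered axial alternatives. Note also that the central sequence rewrites $\Delta^{(n)}_\theta = \sqrt{n}\,(p\,\theta' S_n \theta - 1)$, where $S_n := n^{-1} \sum_{i=1}^n X_{ni} X_{ni}'$ is the sample covariance matrix of the observations (with respect to a fixed location, namely the origin of $\mathbb{R}^p$). Consequently, optimal testing of uniformity for axial data will be based on $S_n$. This is to be compared with the non-axial case considered in Cutting, Paindaveine and Verdebout (2017), where optimal testing of uniformity is rather based on $\bar{X}_n = n^{-1} \sum_{i=1}^n X_{ni}$. This will have important consequences when considering the unspecified-$\theta$ case we turn to in Sections 4-5.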
More importantly, the optimal axial tests of uniformity in the specified location case directly result from the LAN property above. More precisely, Theorem 3.1 entails that, for the one-sided problem of testing uniformity against the bipolar alternatives ($\kappa > 0$), the test $\phi^{(n)}_{\theta+}$ rejecting the null hypothesis at asymptotic level $\alpha$ whenever $T^{(n)}_\theta := \Delta^{(n)}_\theta / \Gamma_p^{1/2} > z_\alpha$ is locally asymptotically most powerful; here, $z_\alpha = \Phi^{-1}(1 - \alpha)$ denotes the upper $\alpha$-quantile of the standard normal distribution. A routine application of Le Cam's third lemma shows that, under $P^{(n)}_{\theta,\kappa_n,f}$ with $\kappa_n = \tau p/\sqrt{n}$, $T^{(n)}_\theta$ is asymptotically normal with mean $\Gamma_p^{1/2}\tau$ and variance one. Therefore, the corresponding asymptotic power of $\phi^{(n)}_{\theta+}$ is $1 - \Phi(z_\alpha - \Gamma_p^{1/2}\tau)$. Note that this asymptotic power does not converge to $\alpha$ as $p$ diverges to infinity. This may be surprising at first since departures from uniformity here are of a single-spiked nature, that is, only materialize in a single direction out of the $p$ directions in $\mathcal{S}^{p-1}$. The fact that this asymptotic power does not fade out for larger dimensions is explained by the fact that we did not consider local alternatives associated with $\kappa_n = \tau/\sqrt{n}$ but rather with $\kappa_n = \tau p/\sqrt{n}$, which properly scales local alternatives for different dimensions $p$. Optimal tests for the other one-sided problem and for the two-sided problem are obtained in a similar way. More precisely, for the one-sided problem involving the girdle-type alternatives ($\kappa < 0$), the test $\phi^{(n)}_{\theta-}$ rejecting the null hypothesis of uniformity at asymptotic level $\alpha$ whenever $T^{(n)}_\theta < -z_\alpha$ is locally asymptotically most powerful and has asymptotic power $\Phi(-z_\alpha - \Gamma_p^{1/2}\tau)$. Finally, the two-sided test, $\phi^{(n)}_{\theta\pm}$ say, rejects the null hypothesis at asymptotic level $\alpha$ whenever $|T^{(n)}_\theta| > z_{\alpha/2}$. This test is locally asymptotically maximin for the two-sided problem and has asymptotic power $1 - \Phi(z_{\alpha/2} - \Gamma_p^{1/2}\tau) + \Phi(-z_{\alpha/2} - \Gamma_p^{1/2}\tau)$ under $P^{(n)}_{\theta,\kappa_n,f}$ with $\kappa_n = \tau p/\sqrt{n}$ ($\tau \neq 0$). Again, the local asymptotic powers of these tests do not fade out for larger dimensions $p$ but rather converge to a constant larger than $\alpha$.
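The specified-$\theta$ procedure can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: the standardizing constant $2(p-1)/(p+2)$ is the null variance of $p\,(X'\theta)^2$ for $X$ uniform on the sphere, which is assumed here to coincide with $\Gamma_p$, and the function names are hypothetical. The power function simply evaluates the limiting normal distribution with mean $\Gamma_p^{1/2}\tau$ and unit variance described above.

```python
import math
import numpy as np

def gamma_p(p):
    """Null variance of p * (X'theta)^2 for X uniform on the sphere (assumed to equal Gamma_p)."""
    return 2.0 * (p - 1) / (p + 2)

def specified_theta_statistic(X, theta):
    """Standardized statistic sqrt(n) * (p * theta' S_n theta - 1) / sqrt(Gamma_p)."""
    n, p = X.shape
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    quad = float(np.mean((X @ theta) ** 2))  # theta' S_n theta
    return math.sqrt(n) * (p * quad - 1.0) / math.sqrt(gamma_p(p))

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def asymptotic_power_bipolar(tau, p, z_alpha=1.6448536269514722):
    """Limiting power of the one-sided test (reject when T > z_alpha) at kappa_n = tau * p / sqrt(n)."""
    return 1.0 - std_normal_cdf(z_alpha - math.sqrt(gamma_p(p)) * tau)

# Usage on a uniform sample (null hypothesis): T should look like a standard normal draw.
rng = np.random.default_rng(2)
G = rng.normal(size=(1000, 3))
U = G / np.linalg.norm(G, axis=1, keepdims=True)
T_null = specified_theta_statistic(U, [1.0, 0.0, 0.0])
print(T_null, asymptotic_power_bipolar(1.0, 3), asymptotic_power_bipolar(1.0, 10**6))
```

Since $2(p-1)/(p+2)$ increases to $2$ as $p$ grows, the power at a fixed $\tau$ increases with $p$ toward $1 - \Phi(z_\alpha - \sqrt{2}\,\tau)$, in line with the remark that powers do not fade out in high dimensions under the $\tau p/\sqrt{n}$ scaling.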
We conducted the following Monte Carlo exercise in order to check the validity of our asymptotic results. For any combination $(n, p)$ of sample size $n \in \{100, 1000\}$ and dimension $p \in \{3, 10\}$, we generated collections of 5000 independent random samples of size $n$ from the Watson distribution with location $\theta = (1, 0, \ldots, 0)' \in \mathbb{R}^p$ and concentration $\kappa_n = \tau p/\sqrt{n}$, for $\tau = -2, -1, 0, 1, 2$; see Section 2. The value $\tau = 0$ corresponds to the null hypothesis of uniformity over $\mathcal{S}^{p-1}$, whereas the larger the non-zero value of $|\tau|$ is, the more severe the alternative is. Kernel density estimates of the resulting values of the test statistic $T^{(n)}_\theta$ in (5) are provided in Figure 1, which further plots the densities of the corresponding asymptotic distributions (for the null case $\tau = 0$, histograms of the values of $T^{(n)}_\theta$ are also shown). Clearly, our asymptotic results are confirmed by these simulations (yet, unsurprisingly, larger dimensions require larger sample sizes for asymptotic results to materialize).

The unspecified location case: the Bingham test
We now turn to the unspecified-$\theta$ version of the testing problems considered in the previous section. We focus first on the one-sided problem of testing uniformity against the bipolar alternatives associated with $\kappa > 0$. It is convenient to reparameterize the submodel associated with $\kappa \geq 0$ by writing $\vartheta = \sqrt{\kappa}\,\theta$. In this new parametrization (which, unlike the original curved one, is flat), the null hypothesis of uniformity takes the form $\mathcal{H}_0 : \vartheta = 0$. Theorem 4.1 below then describes the asymptotic behavior of the corresponding local log-likelihood ratios.
To be able to state the result, we need to introduce the following notation. For a matrix $A$, we will write $\operatorname{vec} A$ for the vector obtained by stacking the columns of $A$ on top of each other. We let $J_p := (\operatorname{vec} I_p)(\operatorname{vec} I_p)'$, where $I_\ell$ is the $\ell \times \ell$ identity matrix. Finally, with the usual Kronecker product $\otimes$, the $p^2 \times p^2$ commutation matrix is $K_p := \sum_{i,j=1}^p (e_i e_j') \otimes (e_j e_i')$, where $e_\ell$ is the $\ell$th vector of the canonical basis of $\mathbb{R}^p$. We then have the following result (Theorem 4.1), in which $\boldsymbol{\Delta}^{(n)}$ is, still under $P^{(n)}_0$, asymptotically normal with mean vector zero and covariance matrix $\boldsymbol{\Gamma}_p$.
Theorem 4.1 shows that the contiguity rate for $\vartheta$ is $n^{-1/4}$, which corresponds to the contiguity rate $n^{-1/2}$ obtained for $\kappa$ in Theorem 3.1 (recall that $\vartheta = \sqrt{\kappa}\,\theta$); however, as we will explain below, the limiting experiment in Theorem 4.1 is non-standard. Writing $A^-$ for the Moore-Penrose generalized inverse of $A$, a natural test of uniformity is the test rejecting the null hypothesis at asymptotic level $\alpha$ whenever $Q^{(n)} := (\boldsymbol{\Delta}^{(n)})' \boldsymbol{\Gamma}_p^{-} \boldsymbol{\Delta}^{(n)} > \chi^2_{d_p,1-\alpha}$, where we denoted as $\chi^2_{d_p,1-\alpha}$ the upper $\alpha$-quantile of the chi-square distribution with $d_p := p(p+1)/2 - 1$ degrees of freedom. This test, which rejects the null hypothesis when the sample variance of the eigenvalues $\hat\lambda_{n1}, \ldots, \hat\lambda_{np}$ of $S_n$ is too large, also addresses the problem of testing uniformity against the one-sided alternatives associated with $\kappa < 0$ or against the two-sided alternatives associated with $\kappa \neq 0$. This procedure, which is known as the Bingham (1974) test (hence will be denoted as $\phi^{(n)}_{\rm Bing}$ in the sequel), is often regarded as the simplest test of uniformity for axial data; see Section 10.7 in Mardia and Jupp (2000). When applied to the unit vectors $X_{ni} = Z_{ni}/\|Z_{ni}\|$, $i = 1, \ldots, n$, obtained from Euclidean data $Z_{ni}$, $i = 1, \ldots, n$, this test is also the sign test of sphericity from Hallin and Paindaveine (2006), and it follows from that paper that, as an axial test for uniformity on the sphere, the Bingham test is optimal against angular Gaussian alternatives (see Tyler, 1987), that is, against projections of elliptical distributions on the sphere.
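The Bingham statistic is easy to compute from the sample covariance matrix. The sketch below uses the classical closed form $Q = \frac{n p (p+2)}{2} \{\operatorname{tr}(S_n^2) - 1/p\}$ found in the directional statistics literature; that this expression matches the statistic $Q^{(n)}$ of the present section is an assumption, and the function names are hypothetical. The code also checks numerically that the trace form is a multiple of the sum of squared deviations of the eigenvalues from $1/p$, which is the "variance of the eigenvalues" reading given in the text.

```python
import numpy as np

def bingham_statistic(X):
    """Classical Bingham statistic n * p * (p + 2) / 2 * (tr(S^2) - 1/p), with S = X'X / n."""
    n, p = X.shape
    S = X.T @ X / n
    return n * p * (p + 2) / 2.0 * (np.trace(S @ S) - 1.0 / p)

def bingham_degrees_of_freedom(p):
    """Degrees of freedom d_p = p(p+1)/2 - 1 of the limiting chi-square null distribution."""
    return p * (p + 1) // 2 - 1

rng = np.random.default_rng(3)
n, p = 2000, 3
G = rng.normal(size=(n, p))
U = G / np.linalg.norm(G, axis=1, keepdims=True)   # uniform sample on the sphere

Q = bingham_statistic(U)

# Same quantity written through the eigenvalues of S_n (they sum to one since tr(S_n) = 1).
lam = np.linalg.eigvalsh(U.T @ U / n)
Q_eig = n * p * (p + 2) / 2.0 * np.sum((lam - 1.0 / p) ** 2)

print(Q, Q_eig, bingham_degrees_of_freedom(p))
```

To run the test, `Q` would be compared with the upper $\alpha$-quantile of the chi-square distribution with `bingham_degrees_of_freedom(p)` degrees of freedom, obtained from any statistical library.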
Local asymptotic powers of the Bingham test can be obtained from the LAN result in Theorem 3.1 and Le Cam's third lemma. We have the following result.
This result in particular shows that the Bingham test is a two-sided procedure, as the asymptotic power in (10) exhibits a symmetric pattern with respect to girdle-type alternatives ($\tau < 0$) and bipolar alternatives ($\tau > 0$). This power, unlike the powers of the specified-$\theta$ tests in the previous section, converges to $\alpha$ as $p$ diverges to infinity, which reflects the fact that, for larger dimensions, the Bingham test severely suffers (even asymptotically) from the unspecification of $\theta$. Note also that since the Bingham test is invariant with respect to rotations, its limiting power naturally does not depend on the location parameter $\theta$ under the alternative.
We conducted the following simulation exercise to check the validity of the asymptotic results of this section. In dimension $p = 3$, we generated 5000 mutually independent random samples of size $n = 2000$ from the Watson distribution with location $\theta = (1, 0, \ldots, 0)' \in \mathbb{R}^p$ and concentration $\kappa_n = \tau p/\sqrt{n}$, for $\tau = -4, -3, 0, 3, 4$; see Section 2. We did the same in dimension $p = 10$, with sample size $n = 10000$. For both dimensions $p$, Figure 2 reports kernel density estimates of the resulting values of the Bingham test statistic $Q^{(n)}$. They closely match the corresponding asymptotic distribution in (9). The results also confirm the two-sided nature of the Bingham test, which, irrespective of $\tau_0$, asymptotically behaves in the exact same way under $\tau = \pm\tau_0$.
Our results indicate that the Bingham test shows non-trivial asymptotic powers under the contiguous alternatives identified in the previous section. However, a key point is the following: if $(\boldsymbol{\tau}_n) \to \boldsymbol{\tau}$ in the LAN result of Theorem 4.1, then, under $P^{(n)}_{\vartheta_n,f}$ with $\vartheta_n = (p/\sqrt{n})^{1/2} \boldsymbol{\tau}_n$, $\boldsymbol{\Delta}^{(n)}$ is asymptotically normal with mean $s_{\boldsymbol{\tau}} = \boldsymbol{\Gamma}_p \operatorname{vec}(\boldsymbol{\tau}\boldsymbol{\tau}')$ and covariance matrix $\boldsymbol{\Gamma}_p$, so that the sequence of asymptotic experiments at hand does converge to a Gaussian shift experiment ($\boldsymbol{\Delta} \sim \mathcal{N}(s_{\boldsymbol{\tau}}, \boldsymbol{\Gamma}_p)$) involving a constrained shift $s_{\boldsymbol{\tau}}$. As a result, the Bingham test is not Le Cam optimal for the considered problem: this test, which would rather be Le Cam optimal (more precisely, locally asymptotically maximin) for an unconstrained shift $s \in \mathbb{R}^{p^2}$, is here "wasting" power against multi-spiked alternatives that are incompatible with the present single-spiked axial model. This is in line with the fact that the Bingham test, which rejects the null hypothesis of uniformity when the sample variance of the eigenvalues $\hat\lambda_{n1}, \ldots, \hat\lambda_{np}$ of $S_n$ is too large, uses these eigenvalues in a permutation-invariant way (in the considered single-spiked models, it would be more natural to consider specifically $\hat\lambda_{n1}$ and/or $\hat\lambda_{np}$ to detect possible deviations from uniformity).

The unspecified location case: single-spiked tests
A natural question is then: how to construct a test that is more powerful than the Bingham test? We now describe two constructions that actually lead to the same test(s).
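Focusing again at first on the one-sided problem involving the bipolar alternatives, we saw in Section 3 that, in the specified location case, Le Cam optimal tests of uniformity reject the null hypothesis for large values of $T^{(n)}_\theta$, which is proportional to $\sqrt{n}\,(p\,\theta' S_n \theta - 1)$. In the unspecified location case, it is then natural, following Davies (1977, 1987, 2002), to consider the test, $\phi^{(n)}_+$ say, rejecting the null hypothesis of uniformity at asymptotic level $\alpha$ when $T^{(n)}_+ := \sup_{\theta \in \mathcal{S}^{p-1}} T^{(n)}_\theta > c_{p,\alpha,+}$; this statistic depends on the data only through $\hat\lambda_{n1}$, where $\hat\lambda_{n1}$ still denotes the largest eigenvalue of $S_n$ and $c_{p,\alpha,+}$ is such that this test has asymptotic size $\alpha$ under the null hypothesis. A similar rationale yields natural tests for the other one-sided problem and for the two-sided problem: since Le Cam optimal tests of uniformity against girdle-type alternatives reject the null hypothesis for small values of $T^{(n)}_\theta$, the resulting unspecified-$\theta$ test, $\phi^{(n)}_-$ say, will reject the null hypothesis of uniformity at asymptotic level $\alpha$ when $T^{(n)}_- := -\inf_{\theta \in \mathcal{S}^{p-1}} T^{(n)}_\theta > c_{p,\alpha,-}$, a statistic that depends on the data only through the smallest eigenvalue $\hat\lambda_{np}$ of $S_n$, and where $c_{p,\alpha,-}$ is such that this test has asymptotic size $\alpha$ under the null hypothesis. Finally, since Le Cam optimal tests of uniformity for the two-sided problem reject the null hypothesis for large values of $|T^{(n)}_\theta|$, the resulting unspecified-$\theta$ test, $\phi^{(n)}_\pm$ say, will reject the null hypothesis of uniformity at asymptotic level $\alpha$ when $T^{(n)}_\pm := \sup_{\theta \in \mathcal{S}^{p-1}} |T^{(n)}_\theta| = \max(T^{(n)}_+, T^{(n)}_-) > c_{p,\alpha,\pm}$, where $c_{p,\alpha,\pm}$ is still such that this test has asymptotic size $\alpha$ under the null hypothesis of uniformity.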
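Another rationale for considering the above tests is the following. For the sake of brevity, let us focus on the one-sided problem involving the bipolar alternatives, that is, the ones associated with $\kappa > 0$. A natural idea to obtain an unspecified-$\theta$ test is to replace $\theta$ in the corresponding optimal specified-$\theta$ test $\phi^{(n)}_{\theta+}$ with an estimator $\hat\theta_n$. Now, under $P^{(n)}_{\theta,\kappa,f}$, rotational symmetry about $\theta$ yields $E[X_{n1} X_{n1}'] = g_f(\kappa)\,\theta\theta' + \frac{1 - g_f(\kappa)}{p-1}\,(I_p - \theta\theta')$ (see, e.g., Lemma B.3(i) in Cutting, Paindaveine and Verdebout, 2017), with $g_f(\kappa) := E[(X_{n1}'\theta)^2]$. It is easy to check that, for any $f \in \mathcal{F}$, the function $\kappa \mapsto g_f(\kappa)$ is differentiable at $0$, with derivative $g_f'(0) = \mathrm{Var}[(X_{n1}'\theta)^2] > 0$, where $\mathrm{Var}$ still denotes variance under $P^{(n)}_0$. Consequently, for $\kappa > 0$ small, we have $g_f(\kappa) > g_f(0) = 1/p$, so that $\theta$ is, up to an unimportant sign (recall that only the pair $\{\pm\theta\}$ is identifiable), the leading unit eigenvector of $E[X_{n1} X_{n1}']$ (for many functions $f$, including the Watson one $f(z) = \exp(z)$, this remains true for any $\kappa > 0$). Therefore, a moment estimator of $\theta$ is the leading eigenvector $\hat\theta_n$ of $S_n = \frac{1}{n} \sum_{i=1}^n X_{ni} X_{ni}'$. Note that, in the Watson parametric submodel, $\hat\theta_n$ is also the maximum likelihood estimator of $\theta$ when $\kappa > 0$. Since $\hat\theta_n' S_n \hat\theta_n = \hat\lambda_{n1}$, plugging $\hat\theta_n$ into $T^{(n)}_\theta$ leads back to the test statistic $T^{(n)}_+$. The null behavior of such statistics goes back to Anderson and Stephens (1972), where the corresponding one-sided tests $\phi^{(n)}_+$ and $\phi^{(n)}_-$ were first proposed. We extend their result to the two-sided test statistic $T^{(n)}_\pm$ and, more importantly, to the non-null case. The key to do so is the following result.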
A direct consequence of Theorem 5.1(i) is that simulations can be used to obtain arbitrarily precise estimates of the asymptotic critical values needed to implement the tests $\phi^{(n)}_+$, $\phi^{(n)}_-$, and $\phi^{(n)}_\pm$. Interestingly, the following corollary shows that simulations can actually be avoided in dimensions $p = 2$ and $p = 3$, as the asymptotic null distributions of the test statistics then admit explicit cumulative distribution functions (in these expressions, $\Phi''$ is the second derivative of the standard normal distribution function $\Phi$).
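Simulation-based critical values can be sketched as follows. Note the swapped-in technique: rather than simulating the limiting random variables of Theorem 5.1 (whose exact form is not reproduced here), the sketch simulates the finite-sample null distribution of the eigenvalue statistics under uniformity at the sample size of interest. The statistics are written in the unstandardized form $\sqrt{n}(p\hat\lambda_{n1} - 1)$ and $-\sqrt{n}(p\hat\lambda_{np} - 1)$; any common positive rescaling of a statistic and of its critical value leaves the test unchanged. All function names are hypothetical.

```python
import numpy as np

def spiked_statistics(X):
    """Return (t_plus, t_minus, t_pm) built from the extreme eigenvalues of S_n = X'X / n."""
    n, p = X.shape
    lam = np.linalg.eigvalsh(X.T @ X / n)          # eigenvalues in ascending order
    t_plus = np.sqrt(n) * (p * lam[-1] - 1.0)      # large when the top eigenvalue is large
    t_minus = -np.sqrt(n) * (p * lam[0] - 1.0)     # large when the bottom eigenvalue is small
    return t_plus, t_minus, max(t_plus, t_minus)

def uniform_sample(n, p, rng):
    """n independent points uniformly distributed on the unit sphere of R^p."""
    g = rng.normal(size=(n, p))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def null_critical_values(n, p, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo (1 - alpha)-quantiles of the three statistics under uniformity."""
    rng = np.random.default_rng(seed)
    stats = np.array([spiked_statistics(uniform_sample(n, p, rng)) for _ in range(reps)])
    return np.quantile(stats, 1.0 - alpha, axis=0)

# Usage: critical values for n = 200, p = 3, then the three decisions on a new sample.
crit = null_critical_values(200, 3, reps=300)
sample = uniform_sample(200, 3, np.random.default_rng(7))
decisions = [stat > c for stat, c in zip(spiked_statistics(sample), crit)]
print(crit, decisions)
```

Increasing `reps` makes the estimated critical values more precise, at a cost that is linear in the number of replications.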
Writing $\lambda_\ell(A)$ for the $\ell$th largest eigenvalue of the $p \times p$ matrix $A$ and denoting as $\stackrel{D}{=}$ equality in distribution, Theorem 5.1 entails that, for any dimension $p$, the test statistics $T^{(n)}_+$ and $T^{(n)}_-$ share the same weak limit under the null hypothesis, which is confirmed in dimensions $p = 2, 3$ by Corollary 5.1. Maybe surprisingly, this corollary further implies that, for $p = 2$, the two-sided test statistic $T^{(n)}_\pm$ has the same asymptotic null distribution as $T^{(n)}_+$ and $T^{(n)}_-$. This can be explained as follows: since $S_n$ has trace one almost surely, its eigenvalues $\hat\lambda_{n\ell}$, $\ell = 1, \ldots, p$, sum up to one almost surely (incidentally, this implies that the quantities $\sqrt{n}(p\hat\lambda_{n\ell} - 1)$, $\ell = 1, \ldots, p$, do not admit a joint density, not even asymptotically so, which makes the proof of Theorem 5.1 rather challenging). For $p = 2$, it follows that $T^{(n)}_+ = T^{(n)}_- = T^{(n)}_\pm$ almost surely, which of course entails that these three test statistics share the same weak limit, not only under the null hypothesis but under any sequence of hypotheses. In line with this, $L^{\max}_{2,\tau} = -L^{\min}_{2,\tau} = \max(L^{\max}_{2,\tau}, -L^{\min}_{2,\tau})$ almost surely for any $\tau \in \mathbb{R}$.
To check the validity of Theorem 5.1 and Corollary 5.1, we conducted the following numerical exercises in dimensions $p = 3$ and $p = 10$. We generated 5000 mutually independent random samples of size $n = 2000$ (for $p = 3$) and $n = 10000$ (for $p = 10$) from the Watson distribution with location $\theta = (1, 0, \ldots, 0)' \in \mathbb{R}^p$ and concentration $\kappa_n = \tau p/\sqrt{n}$, for $\tau = -4, -3, 0, 3, 4$. Kernel density estimates of the resulting values of $T^{(n)}_+$, $T^{(n)}_-$, and $T^{(n)}_\pm$ are reported in Figure 3, along with the densities of the corresponding asymptotic distributions; for $p = 3$ and $\tau = 0$, these densities are those associated with the distribution functions in (15)-(16), whereas, in all other cases, they are kernel density estimates obtained from $10^6$ independent realizations of $L^{\max}_{p,\tau}$, $-L^{\min}_{p,\tau}$, and $\max(L^{\max}_{p,\tau}, -L^{\min}_{p,\tau})$, respectively; see Theorem 5.1. Clearly, the results support our asymptotic findings.
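The coincidence of the three statistics for $p = 2$ follows in one line from the trace constraint. The derivation below is a sketch written for the unstandardized eigenvalue statistics $\sqrt{n}(p\hat\lambda_{n1} - 1)$ and $-\sqrt{n}(p\hat\lambda_{np} - 1)$; the assumption is that the statistics of this section differ from these only by a common positive factor, which does not affect the argument.

```latex
\[
  \hat\lambda_{n1} + \hat\lambda_{n2}
  = \operatorname{tr}(S_n)
  = \frac{1}{n} \sum_{i=1}^{n} \|X_{ni}\|^2
  = 1
  \quad\Longrightarrow\quad
  \sqrt{n}\,(2\hat\lambda_{n1} - 1)
  = \sqrt{n}\,(1 - 2\hat\lambda_{n2})
  = -\sqrt{n}\,(2\hat\lambda_{n2} - 1) \;\geq\; 0 ,
\]
\[
  \text{so that, for } p = 2, \qquad
  T^{(n)}_{+} = T^{(n)}_{-} = \max\big(T^{(n)}_{+}, T^{(n)}_{-}\big) = T^{(n)}_{\pm}
  \quad \text{almost surely.}
\]
```

For $p \geq 3$, the trace constraint ties the extreme eigenvalues only through the intermediate ones, so the three statistics are genuinely different.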
It is seen that the one-sided test $\phi^{(n)}_+$ not only shows power against the bipolar alternatives it is designed for (those associated with $\tau > 0$) but also against girdle-type ones (those associated with $\tau < 0$), which is actually desirable. The same can be said about the one-sided test $\phi^{(n)}_-$, but each of these tests, of course, shows higher powers against the alternatives it was designed for. In contrast, the two-sided test $\phi^{(n)}_\pm$ shows a symmetric power pattern for positive and negative values of $\tau$.

Finite-sample comparisons
In the previous sections, we conducted Monte Carlo exercises in order to check the correctness of our null and non-null asymptotic results, but it is of course of primary importance to compare the power behaviors of the various tests considered in this work. In this section, we therefore study the finite-sample powers of the Bingham test $\phi^{(n)}_{\rm Bing}$ and of the single-spiked tests of Section 5, and we compare them with those of the optimal specified-$\theta$ test $\phi^{(n)}_{\theta+}$. Our asymptotic results further allow us to complement these finite-sample comparisons with comparisons of the corresponding asymptotic powers.
We conducted the following Monte Carlo experiment. For any combination $(n, p)$ of sample size $n \in \{200, 20000\}$ and dimension $p \in \{3, 10\}$, we generated collections of 2000 independent random samples of size $n$ from the Watson distribution on $\mathcal{S}^{p-1}$ with location $\theta = (1, 0, \ldots, 0)' \in \mathbb{R}^p$ and concentration $\kappa_n = \tau p/\sqrt{n}$, with $\tau = 0.8\ell$, $\ell = 0, 1, \ldots, 5$. The value $\ell = 0$ corresponds to the null hypothesis of uniformity, whereas $\ell = 1, \ldots, 5$ provide increasingly severe bipolar alternatives. In each sample, we performed three tests at asymptotic level $\alpha = 5\%$, namely the specified-$\theta$ test $\phi^{(n)}_{\theta+}$, the test $\phi^{(n)}_+$, and the Bingham test $\phi^{(n)}_{\rm Bing}$. Figure 4 shows the resulting empirical powers along with their theoretical asymptotic counterparts (for any given $p$ and $\tau$, the asymptotic power of $\phi^{(n)}_+$ was obtained from 10000 independent copies of the random variable $L^{\max}_{p,\tau}$ in Theorem 5.1). The results show that, as expected, the optimal specified-$\theta$ test outperforms both unspecified-$\theta$ tests. The test $\phi^{(n)}_+$ dominates the Bingham test $\phi^{(n)}_{\rm Bing}$ and this dominance, quite intuitively, increases with the dimension $p$. Clearly, rejection frequencies agree very well with our asymptotic results for large sample sizes.
[Figure 3. (Left:) Kernel density estimates of the values of the single-spiked test statistics of Section 5 (with $T^{(n)}_\pm$ in the bottom panel), obtained from 5000 independent random samples of size $n = 2000$ from the Watson distribution with location $\theta = (1, 0, \ldots, 0)' \in \mathbb{R}^p$ and concentration $\kappa = \tau p/\sqrt{n}$, with $\tau = -4, -3, 0, 3, 4$ and with $p = 3$; for $\tau = 0$, histograms of the corresponding test statistics are shown. The densities of the corresponding asymptotic distributions are also plotted (dashed curves). (Right:) The corresponding results for $p = 10$ and $n = 10000$. See Section 5 for details.]

Final comments and perspectives for future research
Practitioners often face axial data, which explains why statistical procedures for such data are presented in most directional statistics monographs and have been the topic of numerous research papers; we refer to Fisher, Lewis and Embleton (1987), Mardia and Jupp (2000), Ley and Verdebout (2017), and to the references therein. However, the non-null properties of axial tests of hypotheses are barely known, compared to those of their non-axial counterparts. Since this is in particular the case for axial tests of uniformity, we systematically studied in this work the asymptotic power of several axial tests of uniformity. In particular, we derived the asymptotic powers of the Bingham and Anderson tests under contiguous rotationally symmetric alternatives. Our results identify the underlying contiguity rate and allow for theoretical power comparisons. Throughout, our asymptotic findings were confirmed through simulations. Far from being of academic interest only, our results may be useful to practitioners who, when rejection occurs, will get some insight into the underlying distribution by combining the outcomes of the various tests of uniformity (that is, they will get hints on the single-spiked vs multi-spiked nature of the distribution, on its bipolar vs girdle nature, etc.).
The non-null asymptotic analysis conducted in this work essentially settles the low-dimensional case. Perspectives for future research therefore mainly relate to the high-dimensional framework. It is rather straightforward to extend the contiguity and LAN results in Theorems 2.1-3.1 to the case where the dimension $p = p_n$ diverges to infinity with $n$ at an arbitrary rate. However, in high dimensions, it is extremely challenging to derive the non-null asymptotic powers of the Bingham and Anderson tests under suitable local alternatives. For the Anderson tests, for instance, this is due to the fact that eigenvalues of sample covariance matrices exhibit complicated phase transition phenomena which, close to uniformity, result in a lack of consistency. These challenging questions are of course beyond the scope of the present work, hence are left for future research.
Letting $t = |\kappa_n|^{1/2} s$ and using (17)-(18) then provides or, equivalently, where $h_n$ is defined through Since $\kappa_n$ is $o(p_n)$, it can be checked that the sequence $(h_n)$ is an approximate $\delta$-sequence, in the sense that $\int_{-\infty}^{\infty} h_n(t)\,dt = 1$ for any $n$ and $\int_{-\varepsilon}^{\varepsilon} h_n(t)\,dt \to 1$ for any $\varepsilon > 0$. Hence, which, by using L'Hôpital's rule, is equal to The result thus follows from (19).
The proofs of Theorems 2.1-3.1 actually only require the particular case of Lemma A.1 corresponding to $p_n \equiv p$. We nevertheless presented this more general version of the lemma as it allows one to extend Theorems 2.1-3.1 to high-dimensional asymptotic scenarios where $p_n$ would diverge to infinity with $n$ (this would prove the claims in high dimensions provided in the last paragraph of Section 2).
Taking the limit as δ → 0, we obtain that almost surely for p = 2 (see the discussion below the corollary).
Taking the limit as δ → 0 shows that the density of ( 1 , 3 ) is which proves the result for T (n) + , hence also for T (n) − (recall from the discussion below the corollary that T (n) + and T (n) − share the same weak limit in any dimension p). Finally, the result for T (n) ± follows from (36) by using the fact that T (n) ± converges weakly to max( 3 ).