Independent factorization of the last zero arcsine law for Bessel processes with drift

We show that the last zero before time $t$ of a recurrent Bessel process with drift starting at $0$ has the same distribution as the product of an independent right censored exponential random variable and a beta random variable. This extends a recent result of Schulte-Geers and Stadje (2017) from Brownian motion with drift to recurrent Bessel processes with drift. Our proof is intuitive and direct while avoiding heavy computations. For this we develop a novel additive decomposition for the square of a Bessel process with drift that may be of independent interest.


Introduction
Let $X = (X_t : t \ge 0)$ denote the coordinate process on the canonical space of continuous functions from $[0, \infty)$ to $\mathbb{R}$ and let $W_x$ denote the law under which $X$ is Brownian motion starting at $x \in \mathbb{R}$. We write $g_t$ for the last zero of $X$ before time $t$. More precisely, $g_t = \sup\{s \le t : X_s = 0\}$ with the usual convention that $\sup \emptyset = -\infty$. The well-known arcsine law for $g_t$ is due to Lévy [Lév39, § 7.7] and states that
$$g_t \text{ under } W_0 \overset{\mathcal{L}}{=} t A_{1/2}, \tag{1}$$
where we used $\overset{\mathcal{L}}{=}$ to indicate equality in law and $A_{1/2}$ is a Beta$(\frac{1}{2}, \frac{1}{2})$ random variable with density
$$P(A_{1/2} \in dx) = \frac{1}{\pi \sqrt{x(1-x)}}\, \mathbb{1}_{(0,1)}(x)\, dx.$$
* Supported at the Technion by a Zuckerman Fellowship.
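As a quick sanity check of the arcsine density, the following sketch compares the empirical distribution function of Beta$(\frac{1}{2}, \frac{1}{2})$ samples with the closed form $\frac{2}{\pi}\arcsin\sqrt{x}$ obtained by integrating the density. It is an illustration only; the function names, seed and sample size are our own choices.

```python
import math
import random


def arcsine_cdf(x):
    """CDF of a Beta(1/2, 1/2) variable: (2/pi) * arcsin(sqrt(x)) on [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    return 2.0 / math.pi * math.asin(math.sqrt(x))


def empirical_beta_cdf(x, n=200_000, seed=0):
    """Monte Carlo estimate of P(A <= x) for A ~ Beta(1/2, 1/2)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = sum(rng.betavariate(0.5, 0.5) <= x for _ in range(n))
    return hits / n


if __name__ == "__main__":
    for x in (0.1, 0.25, 0.5, 0.9):
        print(x, arcsine_cdf(x), empirical_beta_cdf(x))
```

By (1), multiplying such samples by $t$ gives samples with the law of $g_t$ under $W_0$.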
There have been many extensions of (1) to processes besides 1-dimensional Brownian motion, see Chapter 8 of [MY08] and Section 2.5 of [Pit18] as well as references therein for some examples. One such generalization that plays an important role in this paper is due to Lamperti [Lam62, Theorem 5.1] and identifies the distribution of $g_t$ when the underlying Brownian motion is replaced by a recurrent Bessel process of dimension $\delta \in (0, 2)$. We use $P^\delta_x$ to denote the law of this Bessel process when it starts at $x \ge 0$. Lamperti showed that
$$g_t \text{ under } P^\delta_0 \overset{\mathcal{L}}{=} t A_\alpha, \tag{2}$$
where $\alpha = 1 - \frac{\delta}{2} \in (0, 1)$ and $A_\alpha$ is a Beta$(\alpha, 1-\alpha)$ random variable with density
$$P(A_\alpha \in dx) = \frac{\sin(\pi\alpha)}{\pi}\, x^{\alpha-1} (1-x)^{-\alpha}\, \mathbb{1}_{(0,1)}(x)\, dx.$$
Lévy's last zero arcsine law (1) can be seen as a special case of (2) since the law of $|X|$ under $W_0$ is the same as the law of $X$ under $P^1_0$.
Another more recent generalization that is central to our results is that of characterizing the law of $g_t$ when a constant drift $\mu \in \mathbb{R}$ is added to the underlying Brownian motion. We use $W^\mu_x$ to denote the law of Brownian motion with constant drift $\mu$ when it starts at $x \in \mathbb{R}$. Using a random walk approximation argument, Schulte-Geers and Stadje [SGS17, Theorem 2.1] show that $g_t$ under $W^\mu_0$ has the independent factorization
$$g_t \text{ under } W^\mu_0 \overset{\mathcal{L}}{=} \min\{t, E_\mu\}\, A_{1/2}, \tag{3}$$
where $E_\mu$ is an Exp$(\frac{1}{2}\mu^2)$ random variable independent of $A_{1/2}$ and with density
$$P(E_\mu \in dx) = \frac{\mu^2}{2}\, e^{-\frac{1}{2}\mu^2 x}\, \mathbb{1}_{(0,\infty)}(x)\, dx.$$
Despite the elegant and elementary nature of (3), it seems to have escaped notice until [SGS17], see also [IO20, Remark 2.1]. The present author first learned of this characterization from a MathOverflow answer [esg], presumably written by the first author of [SGS17]. Another attractive feature of (3) is that it allows us to easily recover Lévy's last zero arcsine law (1) since $\min\{t, E_\mu\}$ degenerates to $t$ as $\mu \to 0$. Moreover, since the last exit time from $0$ is almost surely finite when $\mu \neq 0$, we can also recover the law of $g_\infty$ in this case. Letting $t \to \infty$ in (3) gives
$$g_\infty \text{ under } W^\mu_0 \overset{\mathcal{L}}{=} E_\mu A_{1/2}.$$
After recalling the beta-gamma algebra [RY99, Chapter 0.6], it follows that $g_\infty$ under $W^\mu_0$ is a Gamma$(\frac{1}{2}, \frac{1}{2}\mu^2)$ random variable with density
$$W^\mu_0(g_\infty \in dx) = \frac{|\mu|}{\sqrt{2\pi x}}\, e^{-\frac{1}{2}\mu^2 x}\, \mathbb{1}_{(0,\infty)}(x)\, dx.$$
While verifying (3) directly using a Girsanov measure change argument is a straightforward matter, tedious calculations are required as can be witnessed in Section 2 of [IO20] where this is carried out in detail. Indeed, in [SGS17, Remark 2.3], the authors appeal for a "purely Brownian" explanation of the independent factorization (3). This leads us to the main contributions of this paper:
1. Extending (3) to Bessel processes of dimension $\delta \in (0, 2)$ with positive drift (in the sense of Watanabe [Wat75]), thereby unifying the last zero arcsine laws for Brownian motion (1) and recurrent Bessel processes (2) under the independent factorization framework of (3) when drift is present.
2. Giving a "purely Bessel" explanation for the independent factorization (3) and the aforementioned extension to Bessel processes with drift that is intuitive, direct, and avoids heavy computation. For this we develop an additive decomposition for the square of a Bessel process with drift which appears to be new and may be of independent interest.
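The statement about $g_\infty$ in the Brownian case lends itself to a numerical check. The sketch below samples the product of an independent Exp$(\frac{1}{2}\mu^2)$ variable and a Beta$(\frac{1}{2}, \frac{1}{2})$ variable and compares its empirical distribution function with that of a Gamma$(\frac{1}{2}, \frac{1}{2}\mu^2)$ variable, which can be written as $\operatorname{erf}(|\mu|\sqrt{x/2})$. The function names, seed and sample size are illustrative assumptions and not part of the original argument.

```python
import math
import random


def gamma_half_cdf(x, mu):
    """CDF of Gamma(shape 1/2, rate mu^2/2), i.e. erf(|mu| * sqrt(x / 2))."""
    if x <= 0.0:
        return 0.0
    return math.erf(abs(mu) * math.sqrt(x / 2.0))


def empirical_product_cdf(x, mu, n=200_000, seed=1):
    """Monte Carlo estimate of P(E * A <= x), E ~ Exp(mu^2/2), A ~ Beta(1/2, 1/2)."""
    rng = random.Random(seed)
    rate = 0.5 * mu * mu
    hits = 0
    for _ in range(n):
        e = rng.expovariate(rate)      # exponential factor with rate mu^2 / 2
        a = rng.betavariate(0.5, 0.5)  # independent arcsine factor
        if e * a <= x:
            hits += 1
    return hits / n


if __name__ == "__main__":
    mu = 1.5
    for x in (0.1, 0.3, 1.0):
        print(x, gamma_half_cdf(x, mu), empirical_product_cdf(x, mu))
```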
The remainder of the paper is organized as follows. In Section 2 we recall the definition of Bessel processes with drift and state our main theorems. In Section 3, we review several relevant properties of Bessel processes and bridges and prove some preliminary results. Finally, the main theorems are proved in Section 4 where we also outline an alternative proof in Section 4.2 and discuss how it compares to the proof we give.
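When $\mu > 0$, these processes are determined by the generator
$$L^{\delta,\mu} = \frac{1}{2} \frac{d^2}{dx^2} + \left( \frac{\delta - 1}{2x} + \frac{h'_{\delta,\mu}(x)}{h_{\delta,\mu}(x)} \right) \frac{d}{dx}, \tag{4}$$
with $0$ being a regular boundary with instantaneous reflection if $0 < \delta < 2$ or an entrance boundary if $\delta \ge 2$. When $\mu = 0$, they coincide with the usual Bessel processes having the same dimension. Accordingly, we use $P^{\delta,\mu}_x$ to denote the law of a Bessel process with dimension $\delta$ and drift $\mu$ that starts from $x \ge 0$. By writing $P^\delta_x$ instead of $P^{\delta,0}_x$, this notation subsumes that of Bessel processes without drift from Section 1. We will also work with squared Bessel processes with drift and will use $Q^{\delta,\mu}_x$ to denote the law of these processes. More precisely,
$$(X_t^2 : t \ge 0;\ P^{\delta,\mu}_x) \overset{\mathcal{L}}{=} (X_t : t \ge 0;\ Q^{\delta,\mu}_{x^2}). \tag{5}$$
The appearance of the logarithmic derivative of $h_{\delta,\mu}$ in the first-order term of (4) along with the fact that $L^{\delta,0} h_{\delta,\mu} = \frac{1}{2}\mu^2 h_{\delta,\mu}$ implies that a Bessel process of dimension $\delta$ with drift $\mu$ is simply a Bessel process of the same dimension without drift killed at rate $\frac{1}{2}\mu^2$ and then $h$-transformed by $h_{\delta,\mu}$. In particular, for any fixed $t \ge 0$, this gives the absolute continuity relation
$$\left. \frac{dP^{\delta,\mu}_x}{dP^{\delta}_x} \right|_{\mathcal{F}_t} = \frac{h_{\delta,\mu}(X_t)}{h_{\delta,\mu}(x)}\, e^{-\frac{1}{2}\mu^2 t}, \tag{6}$$
where we used $(\mathcal{F}_t : t \ge 0)$ to denote the canonical filtration. Refer to [Pin95, Section 4.1] for the requisite theory on $h$-transforms of diffusion processes.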
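The eigenfunction relation $L^{\delta,0} h_{\delta,\mu} = \frac{1}{2}\mu^2 h_{\delta,\mu}$ can be checked numerically. The sketch below assumes the normalization $h_{\delta,\mu}(x) = \Gamma(\nu+1)\,(2/(\mu x))^{\nu} I_\nu(\mu x)$ with $\nu = \frac{\delta}{2} - 1$, so that $h_{\delta,\mu}(0) = 1$, and takes $L^{\delta,0} = \frac{1}{2}\frac{d^2}{dx^2} + \frac{\delta-1}{2x}\frac{d}{dx}$. The normalization is our assumption (any constant multiple works equally well), and the function names, series truncation and finite-difference step are illustrative.

```python
import math


def h(delta, mu, x, terms=60):
    """Assumed normalization: Gamma(nu+1) * (2/(mu x))^nu * I_nu(mu x), nu = delta/2 - 1.

    Evaluated through the power series of the modified Bessel function I_nu,
    which gives sum_k (mu x / 2)^(2k) * Gamma(nu+1) / (k! * Gamma(k+nu+1)).
    """
    nu = delta / 2.0 - 1.0
    z2 = (mu * x / 2.0) ** 2
    total = 0.0
    for k in range(terms):
        total += z2 ** k / (math.factorial(k) * math.gamma(k + nu + 1.0))
    return math.gamma(nu + 1.0) * total


def generator_residual(delta, mu, x, eps=1e-3):
    """Finite-difference value of L^{delta,0} h - (mu^2 / 2) h at the point x > 0."""
    hp, h0, hm = h(delta, mu, x + eps), h(delta, mu, x), h(delta, mu, x - eps)
    d1 = (hp - hm) / (2.0 * eps)          # central first derivative
    d2 = (hp - 2.0 * h0 + hm) / eps ** 2  # central second derivative
    return 0.5 * d2 + (delta - 1.0) / (2.0 * x) * d1 - 0.5 * mu ** 2 * h0


if __name__ == "__main__":
    print(generator_residual(0.5, 1.3, 0.8))  # should be close to 0
```

Under this normalization, $\delta = 1$ gives $h_{1,\mu}(x) = \cosh(\mu x)$ and $\delta = 3$ gives $h_{3,\mu}(x) = \sinh(\mu x)/(\mu x)$.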
As remarked upon in [PY81], the name Bessel process with drift is appropriate since for $\delta \in \mathbb{N}$, the law of $X$ under $P^{\delta,\mu}_0$ is the same as that of the modulus of Brownian motion in $\mathbb{R}^\delta$ starting at $0$ with a constant drift vector of magnitude $\mu$, see [RP81, Theorem 3]. Moreover, by the corollary to [Wat75, Theorem 2.1], we have for any $\delta > 0$ and $x, \mu \ge 0$

Interest in these processes is motivated in part by their being, up to a scale factor, the only regular and conservative diffusions on $[0, \infty)$ that satisfy the time inversion property [Wat75, Law08], see Section 3.1. Additionally, when $\delta = 3$ they make an appearance in Williams' path decomposition for Brownian motion with drift [Wil74, PSŻ15]. When $\delta = 3$ and $\mu = 1$, they also coincide with the hyperbolic Bessel process of dimension 3, see [JW13, AGŻ15]. We note that they are distinct from the Bessel processes with constant "naive drift" which arise in studies of bird navigation and queueing theory, see [Yor84, Lin04] and references therein.

Main theorems
Our first result is an additive decomposition of $(X_t : t \ge 0)$ under $Q^{\delta,\mu}_0$ into two independent processes that start at $0$: one being a squared Bessel process of dimension $\delta$ without drift and the other being a squared Bessel process of dimension $4$ with drift $\mu$ which waits for an independent Exp$(\frac{1}{2}\mu^2)$ time before starting. Such a decomposition was alluded to by Pitman and Yor in Remark 5.8.iii of [PY82] but, as far as we know, no explicit statement has appeared in the literature. A random waiting time before starting has featured in a similar additive decomposition for a squared Bessel process of dimension $2$ without drift that appears in Section 3.5.1 of [MY08]. We adopt Mansuy and Yor's notation which makes use of the positive part $x^+ := \max\{0, x\}$.
Remark 1. The piecewise nature of the second summand distinguishes the additivity exhibited in Theorem 1 from the usual kind described in Section 3.4. A similarly exotic form of additivity for squared Bessel processes without drift and with possibly negative dimensions can be found in [PW18, Proposition 1.1].
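A consistency check of the decomposition at the level of single-time marginals can be sketched numerically. For integer $\delta$, the modulus representation recalled earlier means that the squared process under $Q^{\delta,\mu}_0$ at time $t$ is distributed as $|B_t + \mu t e|^2$ for a $\delta$-dimensional Brownian motion $B$ and a unit vector $e$, whose Laplace transform is the standard noncentral chi-square formula. The code compares this with the Laplace transform of the sum of a driftless marginal of dimension $\delta$ and a dimension $4$ marginal with drift $\mu$ evaluated at time $(t - E_\mu)^+$, integrating over $E_\mu$ with Simpson's rule. This tests one-dimensional marginals only, not the process-level statement, and all names and the quadrature are our own assumptions.

```python
import math


def lt_sq_bessel_drift(dim, mu, s, lam):
    """E[exp(-lam * |B_s + mu*s*e|^2)] for a dim-dimensional Brownian motion B from 0.

    This is the noncentral chi-square Laplace transform with scale s and
    noncentrality mu^2 * s.
    """
    c = 1.0 + 2.0 * lam * s
    return c ** (-dim / 2.0) * math.exp(-lam * (mu * s) ** 2 / c)


def lt_decomposition(dim, mu, t, lam, n=2000):
    """Laplace transform at time t of the two-summand decomposition (n must be even)."""
    rate = 0.5 * mu * mu

    def integrand(u):
        # u is the value of the waiting time; the second summand has run for t - u
        return rate * math.exp(-rate * u) * lt_sq_bessel_drift(4, mu, t - u, lam)

    step = t / n
    acc = integrand(0.0) + integrand(t)
    for i in range(1, n):
        acc += (4.0 if i % 2 else 2.0) * integrand(i * step)
    started = acc * step / 3.0          # Simpson's rule over waiting times in [0, t]
    not_started = math.exp(-rate * t)   # waiting time exceeds t, summand still at 0
    return lt_sq_bessel_drift(dim, 0.0, t, lam) * (not_started + started)


if __name__ == "__main__":
    args = (1, 1.2, 2.0, 0.7)  # dimension, drift, time, Laplace variable
    print(lt_sq_bessel_drift(*args), lt_decomposition(*args))
```

The agreement reflects the fact that the integrand over the waiting time is an exact derivative, so the mixture over $E_\mu$ collapses to $\exp(-\lambda\mu^2 t^2/(1+2\lambda t))$.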
Theorem 1 allows us to give a quick and intuitive proof of the independent factorization of the last zero arcsine law (3) and its generalization to Bessel processes with drift which we state below as Theorem 2.
Theorem 2. Let $g_t = \sup\{s \le t : X_s = 0\}$ be the last zero before time $t$ of a Bessel process with dimension $\delta \in (0, 2)$ and drift $\mu > 0$ starting at $0$. Put $\alpha = 1 - \frac{\delta}{2}$ and let $A_\alpha$ and $E_\mu$ be independent Beta$(\alpha, 1-\alpha)$ and Exp$(\frac{1}{2}\mu^2)$ random variables, respectively. Then we have the independent factorization
$$g_t \text{ under } P^{\delta,\mu}_0 \overset{\mathcal{L}}{=} \min\{t, E_\mu\}\, A_\alpha.$$
Remark 2. As mentioned before, $|X|$ under $W^\mu_0$ has the same law as $X$ under $P^{1,|\mu|}_0$ so it follows that (3) is a special case of Theorem 2.
Remark 3. We can recover Lamperti's arcsine law (2) by letting $\mu \to 0$. Similarly, letting $t \to \infty$ and appealing to the beta-gamma algebra shows that $g_\infty$ under $P^{\delta,\mu}_0$ is a Gamma$(\alpha, \frac{1}{2}\mu^2)$ random variable, cf. [PY81, Section 7].
Theorem 2 has a dual formulation in terms of the first zero after a fixed time which we state below as Theorem 3. We prove this directly and also show in Section 3.1 that either Theorem 2 or Theorem 3 can be deduced from the other using the time inversion property.
Theorem 3. Let $d_t = \inf\{s \ge t : X_s = 0\}$ be the first zero after time $t$ of a Bessel process with dimension $\delta \in (0, 2)$ and no drift starting at $x > 0$. Put $\alpha = 1 - \frac{\delta}{2}$ and let $A_\alpha$ and $E_x$ be independent Beta$(\alpha, 1-\alpha)$ and Exp$(\frac{1}{2}x^2)$ random variables, respectively. Then we have the independent factorization
$$d_t \text{ under } P^{\delta}_x \overset{\mathcal{L}}{=} \frac{\max\{t, 1/E_x\}}{A_\alpha}.$$
Remark 4. By letting either $x \to 0$ or $t \to 0$, we see that $d_t$ under $P^\delta_0$ is a shifted and scaled beta prime random variable while $\tau_0$ under $P^\delta_x$ is an inverse gamma random variable.
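One practical consequence of the factorization in Theorem 2 is that $g_t$ can be sampled without simulating a path. The sketch below draws $\min\{t, E_\mu\} A_\alpha$ and records the mean $\alpha (1 - e^{-\mu^2 t/2})/(\mu^2/2)$, which follows from independence together with $E[A_\alpha] = \alpha$ and $E[\min\{t, E_\mu\}] = \int_0^t e^{-\mu^2 s/2}\, ds$. The function names and parameter values are illustrative assumptions.

```python
import math
import random


def sample_last_zero(t, mu, delta, rng):
    """One draw of min{t, E_mu} * A_alpha with alpha = 1 - delta/2, for 0 < delta < 2."""
    alpha = 1.0 - delta / 2.0
    e = rng.expovariate(0.5 * mu * mu)       # Exp(mu^2 / 2) factor
    a = rng.betavariate(alpha, 1.0 - alpha)  # independent Beta(alpha, 1 - alpha) factor
    return min(t, e) * a


def mean_last_zero(t, mu, delta):
    """Mean of min{t, E_mu} * A_alpha, using E[min{t, E}] = (1 - exp(-rate*t)) / rate."""
    alpha = 1.0 - delta / 2.0
    rate = 0.5 * mu * mu
    return alpha * (-math.expm1(-rate * t)) / rate


if __name__ == "__main__":
    rng = random.Random(2)
    draws = [sample_last_zero(1.0, 1.0, 0.5, rng) for _ in range(100_000)]
    print(sum(draws) / len(draws), mean_last_zero(1.0, 1.0, 0.5))
```

As $\mu \to 0$ the mean tends to $\alpha t$, in agreement with Lamperti's arcsine law (2).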

Time inversion property and duality
Our proofs rely on some well-known properties of Bessel processes and Bessel bridges which we now recall, starting with the aforementioned time inversion property [Wat75, Theorem 2.1]. For $\delta > 0$ and $x, \mu \ge 0$, this states that
$$(t X_{1/t} : t > 0;\ P^{\delta,\mu}_x) \overset{\mathcal{L}}{=} (X_t : t > 0;\ P^{\delta,x}_\mu). \tag{8}$$
In other words, time inversion preserves dimension but swaps the drift and starting position of Bessel processes with drift. Since Bessel processes without drift satisfy the usual Brownian scaling property, we get from (8) and (7) that for any $\delta, c > 0$

The time inversion property (8) can also be used to establish a duality relation between $g_t$ and $d_t$. More precisely, for any $t > 0$ and $x \ge 0$ with $\delta \in (0, 2)$ we have
$$g_t \text{ under } P^{\delta,x}_0 \overset{\mathcal{L}}{=} \frac{1}{d_{1/t}} \text{ under } P^{\delta}_x. \tag{10}$$
Using (10), it is easy to deduce Theorem 3 from Theorem 2 and vice versa.
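The passage between the two theorems through the duality is a pointwise algebraic identity once both factorizations are written with the same pair of random variables. The sketch below assumes that Theorem 2 reads $g_t \overset{\mathcal{L}}{=} \min\{t, E\} A$ and that Theorem 3 reads $d_t \overset{\mathcal{L}}{=} \max\{t, 1/E\}/A$ (our reading of the statements), and checks that the first expression at time $t$ is the reciprocal of the second at time $1/t$. The function names are hypothetical.

```python
def last_zero_factor(t, e, a):
    """Right-hand side of the last zero factorization for given values e of E and a of A."""
    return min(t, e) * a


def first_zero_factor(t, e, a):
    """Assumed right-hand side of the first zero factorization for the same e and a."""
    return max(t, 1.0 / e) / a


if __name__ == "__main__":
    e, a = 0.7, 0.3  # arbitrary positive value of E and value of A in (0, 1)
    for t in (0.5, 2.0):
        # the product equals 1 because 1 / min{t, e} = max{1/t, 1/e}
        print(t, last_zero_factor(t, e, a) * first_zero_factor(1.0 / t, e, a))
```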

Bessel bridges with δ > 0
Next we introduce the notation $P^{\delta,T}_{x \to y}$ for the law of a Bessel process with dimension $\delta > 0$ which starts at $x \ge 0$ and is conditioned to be at $y \ge 0$ at time $T > 0$, that is, the law of a Bessel bridge with dimension $\delta > 0$ from $x$ to $y$ of length $T$. While the appearance of an arrow in the notation for bridge laws should prevent mistaking the $T$ for drift, we also use $B$ for the coordinate process of a bridge instead of $X$ to further the distinction.
The following lemma is a consequence of the time inversion property (8) and shows how Bessel processes with drift and Bessel bridges are related to each other through a space-time transformation.
Lemma 1. Suppose δ, T > 0 and x, µ ≥ 0. Then we have: ii. X t : t ≥ 0; P δ,µ Proof. We prove part i. by starting from Theorem 5.8 in [PY81] and using (9) Part ii. can be deduced from part i. via

Bessel bridges with δ = 0
It is well known that $0$ is an absorbing state for the Bessel process of dimension $0$ and this will have implications for the corresponding bridges. Before defining Bessel bridges in this case, we first recall the particularly simple distribution function of the absorption time. More generally, we let $\tau_y = \inf\{t > 0 : X_t = y\}$ denote the first hitting time of $y \in \mathbb{R}$ by the coordinate process. Then from [RY99, Corollary XI.1.4] we have
$$P^0_x(\tau_0 \le t) = \exp\left(-\frac{x^2}{2t}\right), \quad t > 0. \tag{11}$$
From this it follows that
$$\tau_0 \text{ under } P^0_x \overset{\mathcal{L}}{=} \frac{1}{E_x}, \tag{12}$$
where $E_x$ is an Exp$(\frac{1}{2}x^2)$ random variable.
Now we expound on the subtlety in the definition of Bessel bridges of dimension $0$ that stems from $0$ being an absorbing state for the underlying unconditioned process, see also [PY82, Section 5.3]. When $x > 0$, the bridge law $P^{0,T}_{x \to 0}$ results from conditioning a $0$-dimensional Bessel path of duration $T$ starting at $x$ to be absorbed before time $T$. Note that this conditioned absorption time will almost surely occur strictly between $0$ and $T$. The bridge law $P^{0,T}_{0 \to x}$ is simply the law of the time reversed bridge under $P^{0,T}_{x \to 0}$, namely
$$(B_t : 0 \le t \le T;\ P^{0,T}_{0 \to x}) \overset{\mathcal{L}}{=} (B_{T-t} : 0 \le t \le T;\ P^{0,T}_{x \to 0}).$$
When both $x, y > 0$, the bridge law $P^{0,T}_{x \to y}$ is defined just as in the $\delta > 0$ case. When $x = y = 0$, the law $P^{0,T}_{0 \to 0}$ is degenerate and assigns probability $1$ to the constant $0$ path. By also conditioning on the exact time of absorption, we can relate a Bessel bridge of dimension $0$ with a bridge of dimension $4$. More precisely, if $x > 0$ and $0 < S \le T$, then we have
$$(B_t : 0 \le t \le S;\ P^{0,T}_{x \to 0}(\,\cdot \mid \tau_0 = S)) \overset{\mathcal{L}}{=} (B_t : 0 \le t \le S;\ P^{4,S}_{x \to 0}).$$
From this it follows by time reversal that
$$(B_{T-S+t} : 0 \le t \le S;\ P^{0,T}_{0 \to x}(\,\cdot \mid g_T = T - S)) \overset{\mathcal{L}}{=} (B_t : 0 \le t \le S;\ P^{4,S}_{0 \to x}). \tag{13}$$
Lemma 1 does not apply when $\delta = 0$ so we need another result for this case. The following lemma serves this purpose by using the same space-time transformation to connect a Bessel bridge of dimension $0$ with the waiting Bessel process of dimension $4$ which appears in Theorem 1.
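The identification of the absorption time with the reciprocal of an exponential variable is easy to test by simulation. The sketch below assumes the distribution function $P^0_x(\tau_0 \le t) = \exp(-x^2/(2t))$ for the $0$-dimensional Bessel process started at $x$ and compares it with the empirical probability that $1/E_x \le t$, where $E_x$ is Exp$(\frac{1}{2}x^2)$. The function names, seed and sample size are our own choices.

```python
import math
import random


def absorption_cdf(x, t):
    """Assumed distribution function of tau_0 under P^0_x: exp(-x^2 / (2 t)) for t > 0."""
    if t <= 0.0:
        return 0.0
    return math.exp(-x * x / (2.0 * t))


def empirical_absorption_cdf(x, t, n=200_000, seed=3):
    """Monte Carlo estimate of P(1/E_x <= t), written as P(E_x >= 1/t) to avoid dividing."""
    rng = random.Random(seed)
    rate = 0.5 * x * x
    hits = sum(rng.expovariate(rate) >= 1.0 / t for _ in range(n))
    return hits / n


if __name__ == "__main__":
    for t in (0.3, 0.8, 2.0):
        print(t, absorption_cdf(1.2, t), empirical_absorption_cdf(1.2, t))
```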
Lemma 2. For $\mu > 0$, let $E_\mu$ be an independent Exp$(\frac{1}{2}\mu^2)$ random variable. Then (
Proof. It follows from (13) that the bridge $(B_t : 0 \le t \le 1)$ under $P^{0,1}_{0 \to \mu}$ can be split into two independent pieces by conditioning on $g_1$. Indeed, we can sample this bridge by first drawing $g_1$ under $P^{0,1}_{0 \to \mu}$ and then sampling a bridge under $P^{4,1-g_1}_{0 \to \mu}$ that is appended to a constant $0$ path of length $g_1$. More precisely, let $\gamma$ be an independent random variable distributed like $g_1$ under $P^{0,1}_{0 \to \mu}$. Then .
Applying the relevant space-time transformation to both sides of this equality in law results in

Since $0 < \gamma < 1$ almost surely, notice that

and similar calculations show that

Writing $f(t)$ for the $\left(t - \frac{\gamma}{1-\gamma}\right)^+$ which appears in both of these identities, now we can make the appropriate substitutions in the right-hand side of (14) and then appeal to part ii. of Lemma 1 to yield

It remains to identify the distribution of the $\frac{\gamma}{1-\gamma}$ appearing in (15). By hypothesis, this random variable is distributed like $\frac{g_1}{1-g_1}$ under $P^{0,1}_{0 \to \mu}$ and is also independent of the Bessel process that appears in (15). From (11), it follows that $g_1$ under $P^{0,1}_{0 \to \mu}$ has distribution function
$$P^{0,1}_{0 \to \mu}(g_1 \le s) = 1 - \exp\left(-\frac{\mu^2}{2} \cdot \frac{s}{1-s}\right), \quad 0 \le s < 1.$$
Hence
$$P\left(\frac{\gamma}{1-\gamma} \le u\right) = 1 - \exp\left(-\frac{\mu^2}{2}\, u\right), \quad u \ge 0.$$
From this we deduce that the $\frac{\gamma}{1-\gamma}$ appearing in (15) is an independent Exp$(\frac{1}{2}\mu^2)$ random variable and the proof is complete.
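The exponential law of $\frac{\gamma}{1-\gamma}$ can be illustrated by simulation, and the simulation also suggests an interpretation through the memoryless property. The sketch below assumes that the absorption time of the $0$-dimensional Bessel process from level $\mu$ is distributed like $1/E$ with $E$ an Exp$(\frac{1}{2}\mu^2)$ variable, that the bridge of length $1$ from $\mu$ to $0$ is obtained by conditioning on absorption before time $1$, and that $g_1$ for the time reversed bridge equals $1$ minus this conditioned absorption time. Under these assumptions $\frac{g_1}{1-g_1} = E - 1$ on the event $\{E \ge 1\}$. The function name and the rejection sampler are our own.

```python
import math
import random


def sample_gamma_ratio(mu, rng):
    """One draw of g_1 / (1 - g_1) for the dimension-0 bridge from 0 to mu of length 1."""
    rate = 0.5 * mu * mu
    while True:
        e = rng.expovariate(rate)  # 1/e plays the role of the unconditioned absorption time
        if e >= 1.0:               # keep only paths absorbed before time 1
            tau = 1.0 / e          # conditioned absorption time, in (0, 1]
            g = 1.0 - tau          # last zero of the time reversed bridge
            return g / (1.0 - g)   # algebraically equal to e - 1


if __name__ == "__main__":
    rng = random.Random(4)
    mu = 1.5
    draws = [sample_gamma_ratio(mu, rng) for _ in range(100_000)]
    print(sum(draws) / len(draws), 2.0 / mu ** 2)  # sample mean vs. mean of Exp(mu^2/2)
```

Since an exponential variable conditioned to exceed $1$, minus $1$, is again exponential with the same rate, the output is consistent with the conclusion of the proof.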

Additivity property
Lastly, we recall the additivity property of squared Bessel processes [RY99, Theorem XI.1.2]. This property states that if $X$ and $X'$ are independent squared Bessel processes of dimensions $\delta, \delta' \ge 0$ starting from $x, x' \ge 0$, then their sum $(X_t + X'_t : t \ge 0)$ is a squared Bessel process of dimension $\delta + \delta'$ starting from $x + x'$. A more succinct statement of the additivity property is
$$Q^{\delta}_x * Q^{\delta'}_{x'} = Q^{\delta+\delta'}_{x+x'}, \tag{16}$$
where we used $Q^{\delta}_x * Q^{\delta'}_{x'}$ to denote the law of the sum of independent processes with laws $Q^{\delta}_x$ and $Q^{\delta'}_{x'}$. The additivity property also applies to squared Bessel bridges and for $\delta, \delta' \ge 0$ and $x, x' \ge 0$ we have the statements
$$Q^{\delta,T}_{x \to 0} * Q^{\delta',T}_{x' \to 0} = Q^{\delta+\delta',T}_{x+x' \to 0}, \qquad Q^{\delta,T}_{0 \to x} * Q^{\delta',T}_{0 \to x'} = Q^{\delta+\delta',T}_{0 \to x+x'}. \tag{17}$$
The analogous result for bridges with general starting and ending points is more complicated, see [PY82, Theorem 5.8].
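The additivity property can be illustrated at a single time for integer dimensions, where a squared Bessel process of dimension $n$ started at $x$ has the same time-$t$ marginal as $|\sqrt{x}\, e_1 + B_t|^2$ for an $n$-dimensional Brownian motion $B$ and $e_1$ the first standard basis vector. The sketch below compares the sum of independent draws with dimensions $1$ and $2$ against direct draws with dimension $3$. It checks one-dimensional marginals only; the function name and parameter values are illustrative assumptions.

```python
import math
import random


def sample_sq_bessel(dim, x, t, rng):
    """Time-t value of a squared Bessel process of integer dimension dim >= 1 from x >= 0."""
    # first coordinate carries the starting point sqrt(x)
    total = (math.sqrt(x) + math.sqrt(t) * rng.gauss(0.0, 1.0)) ** 2
    # remaining coordinates are centred Gaussians with variance t
    for _ in range(dim - 1):
        total += t * rng.gauss(0.0, 1.0) ** 2
    return total


if __name__ == "__main__":
    rng = random.Random(5)
    n, t = 100_000, 0.7
    summed = [sample_sq_bessel(1, 1.0, t, rng) + sample_sq_bessel(2, 0.5, t, rng)
              for _ in range(n)]
    direct = [sample_sq_bessel(3, 1.5, t, rng) for _ in range(n)]
    # both sample means should be close to x + x' + (dim + dim') * t
    print(sum(summed) / n, sum(direct) / n, 1.5 + 3 * t)
```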

Proofs of the main theorems

Proof of Theorem 1
Proof of Theorem 1. We apply part ii. of Lemma 1, the additivity property for bridges (17), then part ii. of Lemma 1 and Lemma 2 to write

In light of (5), this proves the theorem.

Proof of Theorem 2
Arguably, the most direct way to analyze the distribution of $g_t$ under $P^{\delta,\mu}_0$ is via the absolute continuity relation (6) using the joint law of $g_t$ and $X_t$ under $P^{\delta}_0$. Deriving this joint law is relatively straightforward. The $g_t$ marginal is already known from Lamperti's arcsine law (2) and the conditional law of $X_t$ can be deduced using the bridge-meander path decomposition whereby conditioning on $g_t$, the path $(X_s : 0 \le s \le t)$ under $P^{\delta}_0$ splits into a $\delta$-dimensional Bessel bridge of length $g_t$ and an independent $\delta$-dimensional Bessel meander of length $t - g_t$. An explicit joint density follows from the Imhof relation for Bessel meanders [MY08, Section 3.6] and the transition density for Bessel processes [RY99, Section XI.1]. Now the joint density of $g_t$ and $X_t$ under $P^{\delta,\mu}_0$ can be read off from (6). However, one major drawback of the approach outlined above is that onerous computations involving modified Bessel functions are necessary in order to compute the marginal density of $g_t$ under $P^{\delta,\mu}_0$ and prove that it coincides with that of $\min\{t, E_\mu\} A_\alpha$. Indeed, even in the Brownian motion case where Bessel functions are absent, tedious calculations are required to verify (3) via this approach, see [IO20, Section 2]. In fact, it might even be worthwhile to reverse this line of reasoning and see if Theorem 2 leads to any new integral identities for modified Bessel functions, though we do not pursue this question here. By contrast, the method of proof we employ below completely avoids the computation issue by appealing to Theorem 1 instead.
We can use the independence of $E_\mu$ and $X$ along with Bessel scaling to factor out the $\min\{t, E_\mu\}$ from inside the $\sup$ appearing in (18). Together with the fact that the zeros of a process and its square are the same, this allows us to conclude that
$$g_t \text{ under } P^{\delta,\mu}_0 \overset{\mathcal{L}}{=} \min\{t, E_\mu\}\, \sup\{s \le 1 : X_s = 0\} \text{ under } Q^{\delta}_0 \overset{\mathcal{L}}{=} \min\{t, E_\mu\}\, g_1 \text{ under } P^{\delta}_0.$$
Now the desired result follows from Lamperti's arcsine law (2).

Proof of Theorem 3
As mentioned in Section 3.1, Theorem 3 follows from a combination of Theorem 2 and the duality relation (10). However, here we opt for a more direct proof that uses the additivity property for squared Bessel processes without drift from Section 3.4.
Proof of Theorem 3. Since $0$ is an absorbing state for $X'$ under $Q^{0}_{x^2}$, we can use the additivity property (16) to write $\inf\{s \ge t : X_s = 0\}$ under $Q^{\delta}$

The independence of $\tau'_0$ and $X$ allows us to apply Bessel scaling to (19), thereby factoring out the $\max\{t, \tau'_0\}$ from inside the $\inf$. In conjunction with (5) and (12), this implies that

Now we can use (10) and (2) to rewrite (20) as

which completes the proof.