A Renewal version of the Sanov theorem

Large deviations for the local time of a process $X_t$ are investigated, where $X_t=x_i$ for $t \in [S_{i-1},S_i[$ and $(x_j)$ are i.i.d.\ random variables on a Polish space, $S_j$ is the $j$-th arrival time of a renewal process depending on $(x_j)$. No moment conditions are assumed on the arrival times of the renewal process.


Main results
1.1. Outline of the result. Consider an i.i.d. sequence $(x_i)_{i \in \mathbb{N}^+}$ in a Polish space $\mathcal{X}$, with marginal distribution $\mu$. One may define a stochastic process $(X_t)_{t \ge 0}$ on $\mathcal{X}$ by setting $X_t = x_i$ for $t \in [i-1, i[$, and consider its empirical measure $\pi_t := \frac{1}{t}\int_{[0,t[} ds\, \delta_{X_s}$. The ergodic theorem then states that $\pi_t \to \mu$ as $t \to +\infty$, while the Sanov theorem yields a finer estimate for the probability that $\pi_t$ is found in a small neighbourhood of a given Borel probability measure $\nu$ on $\mathcal{X}$. Such probability is estimated, in the sense of large deviations, as $\exp(-t H(\nu|\mu))$, where $H(\nu|\mu)$ is the relative entropy of $\nu$ with respect to $\mu$.
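As a concrete illustration of the classical estimate, consider the hypothetical two-point case $\mathcal{X} = \{0,1\}$ with $\mu(\{1\}) = p$ and $\nu(\{1\}) = q$, where $p, q \in ]0,1[$ are symbols introduced here only for illustration. Using the standard expression of the relative entropy on a finite set, the exponent in the Sanov estimate reads:

```latex
\[
  H(\nu|\mu) \;=\; q \log\frac{q}{p} + (1-q)\log\frac{1-q}{1-p},
  \qquad\text{so that}\qquad
  \mathbb{P}\big(\pi_t(\{1\}) \approx q\big) \;\approx\; \exp\big(-t\,H(\nu|\mu)\big).
\]
```

The exponent vanishes only for $q = p$, in agreement with the ergodic theorem.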
In this paper, we want to provide a similar result in the case in which the time spent by the process $X_t$ at the point $x_i$ may depend on the process itself. In particular, for $\tau : \mathcal{X} \to [0,+\infty]$ a measurable map, define $N_t := \inf\{n \in \mathbb{N}^+ : \sum_{i=1}^{n+1} \tau(x_i) \ge t\}$, and $X_t := x_{N_t+1}$. In the next section, the precise mathematical setting for the study of the large deviations of the empirical measure of $X_t$ is recalled, and a large deviations result is established in Section 1.4. While for $\tau \equiv 1$ one gets the classical Sanov theorem, we are mainly interested in the case where the law of $\tau$ features heavy tails. In such a case the Markov process $(X_t, t - \sum_{i=1}^{N_t} \tau(x_i))$ does not have good ergodic properties, and the classical Donsker-Varadhan theorem is violated.
1.2. Mathematical setting. In the following $\mathbb{N} = \{0, 1, \dots\}$, $\mathbb{N}^+ = \mathbb{N} \setminus \{0\}$; $\mathcal{X}$ is a Polish space, that is a separable, completely metrisable topological space; a general element of $\mathcal{X}^{\mathbb{N}^+}$ will be denoted $x = (x_1, x_2, \dots)$; $C_b(\mathcal{X})$ and $C_c(\mathcal{X})$ are respectively the spaces of bounded continuous functions and compactly supported continuous functions on $\mathcal{X}$. $\mathcal{M}_1(\mathcal{X})$ is the space of positive Radon measures on $\mathcal{X}$ with total variation bounded by 1, while $\mathcal{P}(\mathcal{X}) \subset \mathcal{M}_1(\mathcal{X})$ is the set of Borel probability measures on $\mathcal{X}$. For $\mu \in \mathcal{M}_1(\mathcal{X})$ and $f$ a $\mu$-integrable function, we understand $\mu(f) = \int f\, d\mu$. For $\mu, \nu \in \mathcal{P}(\mathcal{X})$, $H(\nu|\mu)$ denotes the relative entropy of $\nu$ with respect to $\mu$.
We always consider $\mathcal{P}(\mathcal{X})$ equipped with the narrow topology, namely the weakest topology such that $\mu \mapsto \mu(f)$ is continuous for all $f \in C_b(\mathcal{X})$. In the particular case in which $\mathcal{X}$ is locally compact, we will also regard $\mathcal{M}_1(\mathcal{X})$ as a topological space, equipped with the vague topology, namely the weakest topology such that $\mu \mapsto \mu(f)$ is continuous for all $f \in C_c(\mathcal{X})$. $\mathcal{P}(\mathcal{X})$ is then a Polish space, and if $\mathcal{X}$ is locally compact $\mathcal{M}_1(\mathcal{X})$ is a compact Polish space.
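For reference, the relative entropy is commonly defined as follows. This is the standard textbook definition, stated here as an assumption on the convention in use, together with its well-known variational representation over bounded continuous functions.

```latex
\[
  H(\nu|\mu) =
  \begin{cases}
    \displaystyle \int_{\mathcal{X}} d\nu\, \log\frac{d\nu}{d\mu}
      & \text{if } \nu \ll \mu,\\[1ex]
    +\infty & \text{otherwise,}
  \end{cases}
  \qquad
  H(\nu|\mu) = \sup_{f \in C_b(\mathcal{X})}
    \big\{ \nu(f) - \log \mu(e^{f}) \big\}.
\]
```

The second expression shows that $\nu \mapsto H(\nu|\mu)$ is a supremum of narrowly continuous affine maps, hence convex and lower semicontinuous on $\mathcal{P}(\mathcal{X})$.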
Fix a reference probability $\mu \in \mathcal{P}(\mathcal{X})$ and a measurable function $\tau : \mathcal{X} \to [0,+\infty]$; $\tau(x)$ has to be interpreted as the time elapsed at $x$. The measure $\mu$ and the function $\tau$ are the only 'inputs' of the problem.
Define $\xi : \mathcal{X} \to [0,+\infty]$ and $\xi_\infty \in [0,+\infty]$ as [...] where [...]
The role of the auxiliary function $\xi$ and of the assumptions below is discussed at the end of this section. In particular it is remarked that (A2) below is implied by regularity assumptions on $\tau$ (e.g. upper semicontinuity at infinity). Hereafter (A1) and (A2) will always be assumed, while our main results are proved whenever at least one of (A3) or (A4) holds (with somewhat different statements in the two cases).
In the following $x$ is sampled as an i.i.d. sequence with marginal law $\mu$, and $\mathbb{E}$ will denote the expectation of functions of $x$ with respect to $\mu^{\otimes \mathbb{N}^+}$. By (A1), for each $n \in \mathbb{N}$, $t \ge 0$ and a.e. $x$, the following random variables are well defined [...] In other words, [...] and so on, while $\pi_t : \mathcal{X}^{\mathbb{N}^+} \to \mathcal{P}(\mathcal{X})$ is the local time or the empirical measure of $X_t$. Let $P_t := \mu^{\otimes \mathbb{N}^+} \circ \pi_t^{-1}$ be the law of $\pi_t$. From the ergodic theorem, one expects $\pi_t$ to concentrate on a deterministic limit as $t \to +\infty$ (this is easily established, for instance, whenever $\mu(\tau) < +\infty$). Large deviations of $P_t$ are then relevant, and are the subject of investigation of this paper.
[...] $\xi_\infty = \sup_{K \subset\subset \mathcal{Y}} \inf_{y \in K^c} \xi(y, +\infty)$.
(c) As a special case of (b), take $\mathcal{X} := [0,+\infty[ \times [0,+\infty]$ and $d\mu((r,s)) = d\nu(r)\, d\phi(s)$, where $\nu$ is any probability measure on $]0,+\infty[$ and $\phi$ is the exponential law with mean 1. Set $\tau((y,s)) = \theta(y) s$, so that, conditionally on $y$, $\tau$ is an exponential random variable with mean $\theta(y)$. In this setting, $N_t$ is an inhomogeneous Poisson random process, and the empirical measure $\pi_t$ keeps track of the rates of the interarrival times. In this case $\xi(y,t) = +\infty$ for $t < +\infty$ or $y \notin \mathrm{Supp}(\nu)$, while $\xi(y,+\infty) = 1/\theta(y)$ for $y \in \mathrm{Supp}(\nu)$, and $\xi_\infty = \lim_{y \to +\infty} \xi(y,+\infty)$.
(d) An interesting example in which $\tau$ is 'truly' deterministic is the following.
[...] $x_i$. This is a model for a particle moving on a 1-dimensional torus of length 1. During its motion the particle touches some fixed hot points equi-spaced on the torus, and it changes its speed by sampling a new one with law $\mu_i$ at the hot point $i$. $\tau(x)$ is then the time elapsed to complete a tour of the torus.
One can derive the large deviations of some physical quantities (e.g. the kinetic energy of the particle) from the large deviations of the empirical measure of $X_t$. The physically relevant case is $d\mu_i(x_i) = x_i e^{-\beta_i x_i^2}\, dx_i$ for some $\beta_i > 0$. Then $\xi_\infty = +\infty$ and $\xi(x) = +\infty$ unless one of the $x_i$ is 0, in which case $\xi(x) = 0$. As remarked below, when $\{\xi = 0\}$ is non-empty, the large deviations rate functional is not strictly convex. For $n = 1$, this moving particle dynamics has been used as a building block of a toy model of out-of-equilibrium statistical mechanics in [6], where the absence of strict convexity of the rate causes a dynamic phase transition in the model.
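The heavy-tail behaviour in example (d) can be checked by a direct computation. The sketch below assumes, which is not spelled out above, that an arc of length $\ell$ is travelled in time $\ell/x_i$ at speed $x_i$, and that $\mu_i$ is normalised to a probability on $[0,+\infty[$, namely $d\mu_i(x_i) = 2\beta_i x_i e^{-\beta_i x_i^2}\,dx_i$; the symbols $\ell$ and $M$ are introduced here for illustration only.

```latex
\[
  \mu_i\big(\ell/x_i > M\big)
  = \mu_i\big(x_i < \ell/M\big)
  = \int_0^{\ell/M} 2\beta_i\, x\, e^{-\beta_i x^2}\,dx
  = 1 - e^{-\beta_i \ell^2/M^2}
  \;\sim\; \frac{\beta_i\, \ell^2}{M^2}
  \quad\text{as } M \to +\infty .
\]
```

Under these assumptions the tail of the travel time is polynomial, so that $\mu_i\big(e^{c\,\ell/x_i}\big) = +\infty$ for every $c > 0$. No exponential moment of the travel time is finite, which is consistent with the degeneracy $\xi(x) = 0$ at points where some $x_i$ vanishes.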
Theorem 1.3. If (A3) holds, then $(P_t)_{t>0}$ satisfies a good large deviations principle on $\mathcal{P}(\mathcal{X})$ with rate $I$.
In the following remark some features of the functional $I$ are investigated. In particular we characterise the cases where $I$ is strictly convex and those in which it features affine stretches.
If $\xi_\infty < +\infty$, $(P_t)_{t>0}$ is not exponentially tight on $\mathcal{P}(\mathcal{X})$, and large deviations need to be investigated on $\mathcal{M}_1(\mathcal{X})$. However, in this case we need $\mathcal{X}$ to be locally compact in order to have good topological properties of $\mathcal{M}_1(\mathcal{X})$.
Proposition 1.5. Define $I' : \mathcal{M}_1(\mathcal{X}) \to [0,+\infty]$ as [...] If (A4) holds, then $I'$ is a good and convex functional on $\mathcal{M}_1(\mathcal{X})$.
Theorem 1.6. If (A4) holds, then $(P_t)_{t>0}$ satisfies a good large deviations principle on $\mathcal{M}_1(\mathcal{X})$ with rate $I'$.
Under (A1), the key assumption (A2) is satisfied whenever [...] In particular (A2) holds if $\tau$ is upper semicontinuous at infinity. Since all the results stated above make sense even dropping (A2), one may wonder whether it is a merely technical condition. While one can prove the large deviations upper bound even without this assumption, the lower bound is in general false if (A2) does not hold.
1.5. Outlook. With the same notation as above, one may also introduce the process $Y_t$ [...] $\in \mathcal{X} \times [0,1[$. Large deviations for the empirical measure of $Y_t$ would give large deviations of $X_t$ by a standard contraction argument. Moreover, the Donsker-Varadhan theory [3] and its extensions provide general large deviations results for the empirical measure of a Markov process. However, this approach fails in this case. On the one hand, standard Donsker-Varadhan theorems cannot be applied here, since $Y_t$ only enjoys weak ergodic properties. On the other hand, even formally, the Donsker-Varadhan rate functional does not provide the right answer, a feature already remarked in [5] for renewal processes. Indeed, it has been proved in [7] that in general the empirical measure of $Y_t$ does not satisfy a large deviations principle, and in the special cases in which it does (which depends on the law of $\tau$ under $\mu$), the rate functional does not correspond to the Donsker-Varadhan functional. Similarly, the large deviations rate functional for $\pi_t$ does not correspond in general to the one predicted by applying contraction to the Donsker-Varadhan functional for the empirical measure of $Y_t$ (unless $\tau$ has all exponential moments bounded). In this respect, it is perhaps remarkable that the law of $\pi_t$ satisfies a large deviations principle at all.

The functional I
This section is devoted to proving Proposition 1.2, Proposition 1.5 and general properties of the functional $I$, which will play a key role in the proof of the main theorems. First we remark that one can reduce to the case of a compact state space $\mathcal{X}$.
Proposition 2.1. Suppose that Proposition 1.2 and Theorem 1.3 hold under the additional hypothesis that $\mathcal{X}$ is a compact Polish space. Then Proposition 1.2, Theorem 1.3, Proposition 1.5 and Theorem 1.6 hold.
Proof. An arbitrary Polish space $\mathcal{X}$ embeds continuously in $[0,1]^{\mathbb{N}}$. Regard $\mathcal{X}$ as a subset of $[0,1]^{\mathbb{N}}$ and let $\mathcal{Y}$ be the closure of $\mathcal{X}$. Then $\mathcal{Y}$ is compact. Extend $\mu$ to $\mathcal{Y}$ by setting $\mu(\mathcal{Y} \setminus \mathcal{X}) = 0$ and extend $\tau$ to $\mathcal{Y}$ by setting $\tau(x) = +\infty$ for $x \in \mathcal{Y} \setminus \mathcal{X}$. We denote by $\xi_{\mathcal{Y}}$ and $I_{\mathcal{Y}}$ the objects corresponding to $\xi$ and $I$ on $\mathcal{Y}$. Then (A1), (A2) and (A3) hold on $\mathcal{Y}$. Thus, by the hypotheses of this proposition, the extension of $P_t$ to $\mathcal{P}(\mathcal{Y})$ satisfies a large deviations principle with good rate $I_{\mathcal{Y}}$.
Now, if (A3) holds on $\mathcal{X}$, then $\xi_{\mathcal{Y}}(x) = +\infty$ for $x \in \mathcal{Y} \setminus \mathcal{X}$. Thus the map $\Pi : \mathcal{P}(\mathcal{Y}) \to \mathcal{P}(\mathcal{X})$ defined as [...] where we also identified $f$ with its unique continuous extension on $\mathcal{Y}$ (namely $f(x) = 0$ for $x \in \mathcal{Y} \setminus \mathcal{X}$). Then $\Pi'$ is continuous, and we conclude again by the contraction principle.
Motivated by the previous remark, hereafter we assume X to be compact, with no loss of generality.
The following identity follows immediately from (1.2).
Now (2.6) states in particular that I is the supremum of a family of linear lower semicontinuous mappings, thus Proposition 1.2 follows.
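The step from a supremum representation to convexity and lower semicontinuity is the usual one. Writing, as a notational assumption of this sketch, $I = \sup_a \ell_a$ for a family $(\ell_a)$ of linear (or affine) lower semicontinuous maps indexed by a hypothetical parameter $a$, one has for $\lambda \in [0,1]$, $c \ge 0$ and $\nu_1, \nu_2 \in \mathcal{P}(\mathcal{X})$:

```latex
\[
  I\big(\lambda\nu_1 + (1-\lambda)\nu_2\big)
  = \sup_a \big[\lambda\,\ell_a(\nu_1) + (1-\lambda)\,\ell_a(\nu_2)\big]
  \le \lambda\, I(\nu_1) + (1-\lambda)\, I(\nu_2),
  \qquad
  \{I \le c\} = \bigcap_a \{\ell_a \le c\}.
\]
```

Each set $\{\ell_a \le c\}$ is closed, so the sublevel sets of $I$ are closed. When $\mathcal{X}$ is compact, $\mathcal{P}(\mathcal{X})$ is compact as well, and these sublevel sets are therefore compact.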
Proof. Since $J \ge I$ and $I$ is lower semicontinuous, the lower semicontinuous envelope of $J$ is greater than or equal to $I$. Therefore it is enough to show that for each $\nu \in \mathcal{P}(\mathcal{X})$ such that $I(\nu) < +\infty$, there exists a sequence $\nu_n \to \nu$ such that $\lim_n J(\nu_n) \le I(\nu)$.
[...] (2.10)
Indeed $\nu_s(\xi) \le I(\nu) < +\infty$, thus $\nu_s$ is concentrated on $\{\xi < +\infty\}$. Since $\nu_s(\partial A^\delta_j) = 0$, there exists a point $x^\delta_j$ in the interior of $A^\delta_j$ such that $\xi(x^\delta_j) < +\infty$. Then, for each $c > \xi(x^\delta_j)$ and $\varepsilon > 0$, $\lim$ [...] Hence for $M$ large enough $\{L \le \tau \le M\}$ has positive $\mu$-measure in each neighbourhood of $x^\delta_j$, including $A^\delta_j$. The claim (2.10) is thus proved.
By (2.10), for each $L = (L_1, L_2, \dots) \in [0,+\infty[^{\mathbb{N}}$ there exists $M_L \in [0,+\infty[^{\mathbb{N}}$ such that the probability measure [...] is well defined whenever $M \ge M_L$, provided the terms in the summation are understood to vanish whenever $\nu_s(A^\delta_i)$ does. It follows straightforwardly from this definition that for each $\varphi \in C_b(\mathcal{X})$ [...] (2.12)
Note that for each $\delta > 0$ and $L, M \in [0,+\infty[^{\mathbb{N}}$ with $M \ge M_L$, $\nu_{\delta,L,M}$ is absolutely continuous with respect to $\mu$. By the convexity of $I$ proved in Proposition 2.4 [...] where the corresponding terms above are understood to vanish whenever $\nu_a(\mathcal{X})$ or $\nu_s(A^\delta_i)$ do. By direct computation [...] Thus, from Remark 2.5, $\lim$ [...] Together with (2.13) this implies $\sup$ [...] Combining this with (2.12), by a standard diagonal argument, there exists a sequence $\nu_n = \nu_{\delta_n, L_n, M_n}$ converging to $\nu$ such that $\lim_n I(\nu_n) \le I(\nu)$.
Proof. For $r > 0$ let $h(r) = r(\log r - 1) + 1$, and let $\mathcal{F}_F \subset \mathcal{F}_t$ be the $\sigma$-algebra generated by $F$. Then for $\Omega_\mu$-a.e. $x$
$\frac{dQ_F}{dP_F}(F(x)) =$ [...]
Therefore, changing variables in the integration and using the convexity of $h$, [...]
For $n \in \mathbb{N}$ and $x$ such that $N_t(x) = n$ one has
$\mathbb{E}^{\Omega_\mu}\Big[\frac{d\Omega_\nu}{d\Omega_\mu} \,\Big|\, \mathcal{F}_t\Big](x) = \prod_{j=1}^{n+1} \frac{d\nu_j}{d\mu_j}(x_j)$
and thus [...]
The event $\{N_t(x) \ge j-1\}$ only depends on $(x_1, \dots, x_{j-1})$. Therefore the last integral in the above formula splits into a product as
[...] $d\nu_i(x_i)\, \mathbb{1}_{N_t(x) \ge j-1} \int_{\mathcal{X}} d\nu_j(x_j) \log \frac{d\nu_j}{d\mu_j}(x_j)$
which is easily rewritten as in the statement.
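The function $h$ used in the proof above has two elementary properties that explain its role. They are recalled here as a sketch, for generic probability measures $P$ and $Q$ with $Q \ll P$ (placeholder notation, with $Q(1) = P(1) = 1$ denoting total masses).

```latex
\[
  h'(r) = \log r, \qquad h''(r) = \frac{1}{r} > 0, \qquad h(r) \ge h(1) = 0,
\]
\[
  \int h\Big(\frac{dQ}{dP}\Big)\,dP
  = \int \frac{dQ}{dP}\,\log\frac{dQ}{dP}\,dP \;-\; Q(1) \;+\; P(1)
  = H(Q|P).
\]
```

Since $h$ is convex and nonnegative, Jensen's inequality for conditional expectations shows that restricting $P$ and $Q$ to a smaller $\sigma$-algebra can only decrease the integral on the left.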
[...] 0 and $I \equiv +\infty$.
• $(P_t)_{t>0}$ satisfies a large deviations upper bound with good rate $I$ if $\limsup_{t \to +\infty} \frac{1}{t} \log P_t(C) \le -\inf_{u \in C} I(u)$ for all $C \subset \mathcal{Y}$ closed.
• $(P_t)_{t>0}$ satisfies a large deviations lower bound with good rate $I$ if $\liminf_{t \to +\infty} \frac{1}{t} \log P_t(O) \ge -\inf_{u \in O} I(u)$ for all $O \subset \mathcal{Y}$ open.
$(P_t)_{t>0}$ is said to satisfy a good large deviations principle if both the upper and lower bounds hold with the same good rate $I$.
[...] otherwise is continuous on the domain of $I_{\mathcal{Y}}$. Since $\Pi$ is just the restriction map for probabilities concentrated on $\mathcal{X}$, the extension of $P_t$ to $\mathcal{P}(\mathcal{Y})$ is mapped to $P_t$ by $\Pi$. Then by the contraction principle [2, Chapter 4.2], $I$ is good and $P_t$ satisfies a good large deviations principle on $\mathcal{P}(\mathcal{X})$ with rate $I$. It is immediate to check that $\Pi$ preserves convexity, so $I$ is convex.
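For convenience, the contraction principle invoked above can be recalled in its basic form. The spaces $\mathcal{Y}$ and $\mathcal{Z}$, the map $\Pi$ and the symbol $\tilde I$ are placeholder notation for this sketch: if $(P_t)_{t>0}$ satisfies a large deviations principle on $\mathcal{Y}$ with good rate $I$ and $\Pi : \mathcal{Y} \to \mathcal{Z}$ is continuous, then $(P_t \circ \Pi^{-1})_{t>0}$ satisfies a large deviations principle on $\mathcal{Z}$ with the good rate below.

```latex
\[
  \tilde I(z) \;=\; \inf\big\{\, I(y) \;:\; y \in \mathcal{Y},\ \Pi(y) = z \,\big\},
  \qquad z \in \mathcal{Z},
\]
% with the usual convention that the infimum over the empty set is +\infty.
```

The argument above uses a map that is only required to be continuous on the domain of the rate, which is a slightly more general situation than the basic statement recalled here.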