Central Limit Theorem for truncated heavy-tailed Banach-valued random vectors

In this paper we answer, from the point of view of the central limit theorem, the question of the extent to which truncated heavy-tailed random vectors taking values in a Banach space retain the characteristic features of heavy-tailed random vectors.


Introduction
Situations where a heavy-tailed distribution is a good fit, and at the same time there is a physical upper bound on the quantity of interest, are common in nature. Clearly, the natural model for such phenomena is a truncated heavy-tailed distribution: one that matches a heavy-tailed distribution up to a specified limit, beyond which it decays significantly faster or simply vanishes. This leads to the general question: when can the upper bound be considered large enough that the effect of truncation is negligible? The first attempt at answering this question, in finite-dimensional spaces, was made in Chakrabarty and Samorodnitsky (2009). In the current paper, the investigation started in that paper is continued, with the aim of obtaining similar results in Banach spaces.
Suppose that B is a separable Banach space and that H, H_1, H_2, ... are B-valued random variables in the domain of attraction of an α-stable random variable V with 0 < α < 2. This means that there are sequences (a_n) ⊂ B and b_n > 0 so that, as n → ∞,

(1.1) b_n^{-1}(H_1 + ... + H_n − a_n) ⇒ V.

We assume that the truncating threshold goes to infinity along with the sample size, and hence we essentially have a sequence of models. We denote both the sample size and the index of the model by n, and the truncating threshold in the n-th model by M_n. The n-th row of the triangular array consists of observations X_nj, j = 1, ..., n, generated according to the following mechanism:

(1.2) X_nj := H_j 1(‖H_j‖ ≤ M_n) + (H_j/‖H_j‖)(M_n + L_j) 1(‖H_j‖ > M_n), j = 1, ..., n, n = 1, 2, ....

Here (L, L_1, L_2, ...) is a sequence of i.i.d. nonnegative random variables, independent of (H, H_1, H_2, ...). For each n = 1, 2, ... we view the observations X_nj, j = 1, ..., n, as having power tails that are truncated at level M_n. The random variable L models the possibility that, outside the ball of radius M_n, the tail "decays significantly faster or simply vanishes"; L is assumed to have finite second moment. Chakrabarty and Samorodnitsky (2009) distinguished two regimes, soft truncation, where n P(‖H‖ > M_n) → 0, and hard truncation, where n P(‖H‖ > M_n) → ∞, and showed that, as far as the central limit behavior of the row sums is concerned, observations with softly truncated tails behave like heavy-tailed random variables, while observations with hard truncated tails behave like light-tailed random variables. In Theorem 2.1, the main result of this paper, we show that the result under hard truncation can be extended to Banach spaces if the "small ball criterion" holds. Doing this is not straightforward for the following reason: while in finite-dimensional spaces convergence in law is equivalent to the convergence in law of every linear functional to the corresponding functional evaluated at the limit, the same is not true in Banach spaces.
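As a concrete reading of (1.2), the helper below applies the truncation map to a single vector. This is an illustrative sketch only: the function name and the use of a fixed realization l of L are ours, not part of the model.

```python
import numpy as np

def truncate(h, m, l):
    """Apply the truncation mechanism of (1.2) to one observation:
    keep h if ||h|| <= m; otherwise replace it by the vector of norm
    m + l pointing in the same direction as h."""
    norm = np.linalg.norm(h)
    if norm <= m:
        return h
    return (h / norm) * (m + l)

# a vector of norm 5 truncated at m = 1 with l = 0.5 is pushed to norm 1.5
print(truncate(np.array([3.0, 4.0]), 1.0, 0.5))  # -> [0.9 1.2]
```

Vectors inside the ball of radius m pass through unchanged; only the direction of a large vector survives truncation, its norm being reset to m + l.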
In the latter spaces one needs, in addition, to check some tightness condition; see, for example, Ledoux and Talagrand (1991) or Araujo and Giné (1980) for details. Section 2 contains the results and their proofs. A couple of examples are studied in Section 3: one where the hypotheses of Theorem 2.1 can be checked, and another where the claim of that result does not hold. The examples serve the purpose of showing that there is a need for such a result, and that the result has some practical value.

A Central Limit Theorem for truncated heavy-tailed random variables
The triangular array {X_nj : 1 ≤ j ≤ n} is as defined in (1.2). We would like to know if the row sums S_n, defined by

S_n := X_n1 + ... + X_nn,

still converge in law after appropriate centering and scaling. Exactly the same arguments as those in the proof of Theorem 2.1 in Chakrabarty and Samorodnitsky (2009) show that if the truncated heavy-tailed model is in the soft truncation regime, i.e.,

(1.3) lim_{n→∞} n P(‖H‖ > M_n) = 0,

then b_n^{-1}(S_n − a_n) ⇒ V. In other words, from the point of view of the central limit behavior of the partial sums, the softly truncated heavy-tailed model retains much of the heavy-tailedness. Hence, we shall assume throughout that the model is in the hard truncation regime, i.e.,

(2.2) lim_{n→∞} n P(‖H‖ > M_n) = ∞.

As mentioned earlier, easy-to-check criteria for the Central Limit Theorem on Banach spaces are not known. An example of the not-so-easy-to-check ones is Theorem 10.13, page 289 in Ledoux and Talagrand (1991), known as the "small ball criterion". The main result of this paper, Theorem 2.1, is an analogue of this theorem in the truncated setting under hard truncation. Before stating it, we need the following preliminary. It is known that (1.1) implies that there is a probability measure σ on the unit sphere S of B such that the law of H/‖H‖, conditioned on {‖H‖ > x}, converges to σ weakly on S as x → ∞; see Corollary 6.20(b), page 151 in Araujo and Giné (1980). Set

(2.3) B_n := (n M_n^2 P(‖H‖ > M_n))^{1/2}.

Theorem 2.1 asserts that, in the hard truncation regime, the convergence

(2.4) B_n^{-1}(S_n − ES_n) ⇒ γ

for a centered Gaussian measure γ on B holds if and only if two conditions, labeled 1. and 2. in the proof below, are satisfied, the first of which is the small ball criterion:

(1) (small ball criterion) for every ǫ > 0, lim inf_{n→∞} P(‖B_n^{-1}(S_n − ES_n)‖ < ǫ) > 0.

In that case, the characteristic function of γ is given by (2.5), and is determined by α and the spectral measure σ. Here, B′ is the dual of B, the space of continuous linear functionals on B.
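The hard truncation regime can be illustrated by a one-dimensional simulation in the spirit of Chakrabarty and Samorodnitsky (2009). The sketch below is ours (all parameter choices are illustrative): it uses symmetrized Pareto tails, a hard truncation level, and the scaling B_n = (n M_n^2 P(|H| > M_n))^{1/2}; the normalized row sums then behave like a light-tailed, Gaussian sample.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 1.5                     # tail index, 0 < alpha < 2
n, reps = 10_000, 400           # sample size per model, number of replications
M_n = n ** (1 / (2 * alpha))    # hard truncation: n * P(|H| > M_n) -> infinity

# symmetric Pareto-type tails: P(|H| > x) = x^{-alpha} for x >= 1
h = rng.pareto(alpha, size=(reps, n)) + 1.0
h *= rng.choice([-1.0, 1.0], size=(reps, n))

x = np.clip(h, -M_n, M_n)       # truncation (1.2) with L = 0, in one dimension

# B_n^2 = n * M_n^2 * P(|H| > M_n) = n * M_n^{2 - alpha}
B_n = np.sqrt(n * M_n ** (2 - alpha))
z = x.sum(axis=1) / B_n         # E S_n = 0 by symmetry

print(round(z.mean(), 2), round(z.var(), 2))  # mean near 0, variance of order 1
```

Replacing M_n by a sequence with n P(|H| > M_n) → 0 instead reproduces the heavy-tailed, α-stable behaviour of the soft truncation regime.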
For the proof, we shall need the following one-dimensional lemma, which follows by exactly the same arguments as those in Theorem 2.2 of Chakrabarty and Samorodnitsky (2009); hence we omit the proof.
Proof of Theorem 2.1. First we prove the direct part; that is, we assume that 1. and 2. hold. We first show that it suffices to check that {L(Z_n)} is relatively compact, where

Z_n := B_n^{-1} (Y_n1 + ... + Y_nn), Y_nj := X_nj − X′_nj,

and, for every n, X′_n1, X′_n2, ... are i.i.d. copies of X_n1 such that (X′_nj : j ≥ 1) and (X_nj : j ≥ 1) are independent families. To see this, suppose that we have shown that {L(Z_n)} is relatively compact. By Corollary 4.11, page 27 in Araujo and Giné (1980), the sequence {L(B_n^{-1} S_n)} is then relatively shift compact; i.e., there exists a sequence {v_n} ⊂ B such that {L(B_n^{-1} S_n − v_n)} is relatively compact. By (2.2), it follows that B_n ≫ M_n; thus, for fixed t > 0 and n large enough, the shifts v_n may be replaced by B_n^{-1} ES_n. This shows (2.6), and hence that {L[B_n^{-1}(S_n − ES_n)]} is relatively compact. In view of Lemma 2.1, this will complete the proof of the direct part.
First we record some properties of the random variables defined above, which will be used in the proof. The hypotheses immediately imply that for all ǫ > 0,

(2.9) lim inf_{n→∞} P(‖Z_n‖ < ǫ) > 0,

and that

(2.10) sup_{n≥1} E‖Z_n‖ < ∞.

Let {F_k} be any sequence of increasing finite-dimensional subspaces satisfying (2.11). For any subspace F of B, denote by T_F the canonical map from B to the quotient space B/F. By Corollary 6.19 (page 151) in Araujo and Giné (1980), it follows that, for every k, T_{F_k}(H) is in the domain of attraction of some α-stable law with the same scaling constants (b_n) as H, with associated tail constant C_k; it follows by (2.12) that lim_{k→∞} C_k = 0. By the Karamata theorem (Theorem B.1.5, page 363 in de Haan and Ferreira (2006)) and (2.3), the corresponding asymptotic moment estimate holds as n → ∞. That (2.11) holds, together with the finiteness of the measure σ and the assumption EL² < ∞, then implies (2.14). Coming to the proof: in view of the criterion for relative compactness discussed in Ledoux and Talagrand (1991) (pages 40-41), it suffices to show that, given ǫ > 0, there is a finite-dimensional subspace F satisfying (2.15). Let ε_1, ε_2, ... be an i.i.d. sequence of Rademacher random variables, independent of (X_n, X′_n, n ≥ 1), and let E_ε denote the conditional expectation given {Y_nj}. It suffices to show that (2.16) holds for all η > 0, and that there is a numerical constant C > 0 so that for every δ > 0, (2.17) holds whenever {F_k} is an increasing sequence of finite-dimensional subspaces satisfying (2.11).
To establish (2.16), it suffices to check (2.18), in which β > 0 is a constant to be specified later. This is because, for n large enough, the corresponding chain of estimates holds, the equivalence in its second line following from Karamata's theorem; hence showing (2.18) suffices for (2.16). Let σ_{n,F} denote the associated weak variance. By Theorem 4.7 in Ledoux and Talagrand (1991) on the concentration of Rademacher processes, with the median replaced by the expected value as on page 292 of the same reference, the required concentration bound follows. Thus all that needs to be shown is that, given any δ > 0, there is a choice of β depending only on δ so that

lim sup_{k→∞} lim sup_{n→∞} E σ²_{n,F_k} ≤ δ.
Using Lemma 6.6 (page 154) in Ledoux and Talagrand (1991), it follows that, for any n and F, the quantity E σ²_{n,F} splits into two parts. The first part can be made as small as needed by (2.14). For the other part, note that by the contraction principle (Theorem 4.4 in Ledoux and Talagrand (1991)) it is at most 16β E‖Z_n‖; thus, choosing β smaller than δ/(16 sup_{n≥1} E‖Z_n‖) (which is positive because of (2.10)) does the trick. For the proof of (2.17), we shall show that there is a universal constant C > 0 so that (2.20) holds whenever F is a subspace satisfying the relevant smallness condition. The reason that this suffices is the following. Fix δ > 0 and a sequence of increasing finite-dimensional subspaces {F_k} satisfying (2.11). The corresponding bound holds for all n, k ≥ 1; by (2.16) and (2.9), the hypotheses of (2.20) are met, and by (2.20), (2.17) follows. The proof of (2.20) uses an isoperimetric inequality; see Theorem 1.4 (page 26) in Ledoux and Talagrand (1991). In light of the isoperimetric inequality, by arguments similar to those on page 291 of Ledoux and Talagrand (1991), it follows that for k, q ≥ 1 a tail estimate holds, where K is the universal constant in the isoperimetric inequality. Choose q = 2K and k large enough (depending only on θ) so that the estimate simplifies. All that remains to be shown is (2.21). Since EL_1² < ∞, {n^{-1/2} max_{j≤n} L_j} is a tight family. This shows (2.21), and thus establishes (2.20) with C = 4q + 1, completing the proof of the direct part.
The converse is straightforward. For 1., note that if (2.4) holds, then by the continuous mapping theorem the probabilities P(‖B_n^{-1}(S_n − ES_n)‖ < ǫ) converge to γ({x : ‖x‖ < ǫ}), and the right-hand side is positive because in a separable Banach space a centered Gaussian law puts positive mass on any ball of positive radius centered at the origin; see the discussion on pages 60-61 in Ledoux and Talagrand (1991). For proving 2., we shall appeal to Theorem 4.2 in de Acosta and Giné (1979); all that needs to be shown is (2.22), where ξ_n := U_n − E(U_n) and U_n is as defined in (2.7). Fix t > 0. For n large enough so that E(U_n) < t/2, the probability in question splits into two terms. By (2.8), the first term goes to zero; and for the second term, notice that for n large enough it vanishes. This shows (2.22) and hence completes the proof.
Recall that a Banach space B is said to be of type 2 if there is C < ∞ so that, for all N ≥ 1 and all independent zero-mean B-valued random variables X_1, ..., X_N,

E‖X_1 + ... + X_N‖² ≤ C (E‖X_1‖² + ... + E‖X_N‖²).

Banach spaces of type 2 are nice in the sense that every zero-mean random variable X taking values in such a space with E‖X‖² < ∞ satisfies the Central Limit Theorem; in fact, these are the only spaces where this is true. This is the statement of Theorem 10.5 (page 281) in Ledoux and Talagrand (1991). We would like to mention at this point that, while the assumption of type 2 is rather restrictive, the class is fairly large: for example, every Hilbert space, and the L^p spaces for 2 ≤ p < ∞, are Banach spaces of type 2. We show in the following result that (2.4) extends to these spaces.
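In a Hilbert space the type 2 inequality holds with C = 1, and in fact with equality, since independence and zero means annihilate the cross terms in E‖X_1 + ... + X_N‖². The sketch below (dimension, distribution, and sample sizes are our illustrative choices) checks this numerically in the Hilbert space R^5:

```python
import numpy as np

rng = np.random.default_rng(1)
N, d, reps = 50, 5, 20_000

# independent zero-mean summands X_1, ..., X_N in R^5
X = rng.normal(size=(reps, N, d))

# E||X_1 + ... + X_N||^2 versus sum_i E||X_i||^2, estimated by Monte Carlo
lhs = (np.linalg.norm(X.sum(axis=1), axis=1) ** 2).mean()
rhs = (np.linalg.norm(X, axis=2) ** 2).sum(axis=1).mean()
print(lhs / rhs)  # close to 1: equality in the type 2 inequality
```

In a general type 2 space only the one-sided inequality with some C < ∞ survives; c_0, used in Section 3, is a standard example of a space that is not of type 2.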
Theorem 2.2. If B is of type 2 and the model with power-law tails (1.2) is in the hard truncation regime, then there is a Gaussian measure γ on B such that B_n^{-1}(S_n − ES_n) ⇒ γ. The characteristic function of γ is given by (2.5).
Proof. In view of Lemma 2.1, and using arguments similar to those in the proof of Theorem 2.1, it suffices to prove that {L(Z_n)} is relatively compact, where the definitions of Z_n and Y_nj are exactly the same as in the proof of that theorem. Choose a sequence {F_k} of finite-dimensional subspaces satisfying (2.11). Since B is of type 2, so is B/F for any closed subspace F, with the type 2 constant no larger than that of B. Thus, there is C ∈ [0, ∞) so that

E‖T_{F_k}(Z_n)‖² ≤ C B_n^{-2} n E‖T_{F_k}(Y_n1)‖².

Using (2.14), it follows that lim_{k→∞} lim sup_{n→∞} E‖T_{F_k}(Z_n)‖² = 0, which shows (2.15) and thus completes the proof.

Examples
In this section we construct a couple of examples. In Example 1, the hypotheses of Theorem 2.1 can be verified; this shows that the result has some practical value. In Example 2, (2.4) does not hold, which shows that there is a need for a result like Theorem 2.1 or Theorem 2.2.
Example 1. Let {T_jk : j, k ≥ 1} be i.i.d. R-valued symmetric α-stable (SαS) random variables with 0 < α < 2, i.e., with characteristic function

E exp(iθ T_11) = exp(−|θ|^α), θ ∈ R.

For all j ≥ 1, define the R^N-valued random variable H_j as

H_j := a_1 T_j1 e_1 + a_2 T_j2 e_2 + ...,

where (a_j) is a sequence of non-negative numbers satisfying (3.1), and e_k is the element of R^N defined by

(3.2) e_k(n) = 1 if k = n, and e_k(n) = 0 otherwise.
Recall that P(|T_11| > x) = O(x^{-α}); see Property 1.2.15, page 16 in Samorodnitsky and Taqqu (1994). This ensures that H_1, H_2, ... are i.i.d. random variables taking values in c_0, the space of sequences converging to zero, endowed with the sup norm. For that purpose, assuming that the a_j^α are summable would have been sufficient; however, we shall need (3.1) for other reasons. It is immediate that H_1, and hence each H_j, is a c_0-valued symmetric α-stable random variable. This, in particular, means that (3.3) holds as n → ∞. It is a well-known fact in finite-dimensional spaces that the above implies

(3.4) lim_{x→∞} x^α P(‖H_1‖ > x) = C for some C ∈ (0, ∞).

However, since we could not find a reference for this in the Banach space setting, we briefly sketch the argument for the sake of completeness. By Theorem 6.18, page 150 in Araujo and Giné (1980), there is a measure µ on B \ {0} satisfying µ(cD) = c^{-α}µ(D) for all c > 0 and D ⊂ B \ {0}, such that

(3.5) lim_{n→∞} n P(n^{-1/α} H_1 ∈ A) = µ(A)

for all A ⊂ B that are bounded away from the origin and satisfy µ(∂A) = 0. It is also known that 0 < µ({x ∈ B : ‖x‖ ≥ δ}) < ∞ for all δ > 0. Set A := {x ∈ B : ‖x‖ > 1}. Clearly, µ(∂A) = 0. Using (3.5) with this A implies that C := lim_{n→∞} n P(‖H_1‖ > n^{1/α}) exists, and is finite and positive. Let (x_k) be any sequence of positive numbers going to infinity, and set n_k := ⌊x_k^α⌋. Observe that x_k^α P(‖H_1‖ > x_k) is sandwiched between n_k P(‖H_1‖ > (n_k + 1)^{1/α}) and (n_k + 1) P(‖H_1‖ > n_k^{1/α}), and that both bounds converge to C. Thus, (3.4) follows. The letter C will be used to denote various such constants, with possibly different definitions, throughout this section.
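The coordinates a_k T_jk of H_j are easy to simulate. The sketch below draws symmetric α-stable variables by the Chambers-Mallows-Stuck method and evaluates the sup norm of the first K coordinates; the generator, the finite section, and the rapidly decaying sequence (a_k) are our illustrative choices (the actual condition (3.1) is not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(2)

def sas(alpha, size):
    """Symmetric alpha-stable sample via Chambers-Mallows-Stuck (alpha != 1)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

alpha, K, nsamp = 1.5, 200, 1000
a = 1.0 / np.arange(1, K + 1) ** 2      # an illustrative decaying sequence (a_k)
T = sas(alpha, size=(nsamp, K))          # T_{jk}, 1 <= j <= nsamp, 1 <= k <= K
H = a * T                                # first K coordinates of H_j
sup_norm = np.abs(H).max(axis=1)         # the sup norm of the finite section
```

The empirical tail of |T_11| decays like x^{-alpha}, and the decay of (a_k) forces the coordinates of each H_j to vanish, matching the c_0 setting.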
Let (M_n) be a sequence such that 1 ≪ M_n ≪ n^{1/α}, and let X_nj be the truncation of H_j at level M_n with L ≡ 0, that is,

X_nj := H_j 1(‖H_j‖ ≤ M_n) + (H_j/‖H_j‖) M_n 1(‖H_j‖ > M_n).

As before, define the row sum by S_n := X_n1 + ... + X_nn. We shall show that, for this setup, the hypotheses of Theorem 2.1 can be verified by purely elementary methods; the only sophisticated result that will be used is the contraction principle in finite-dimensional spaces. All that needs to be shown is (3.6), with B_n := (n M_n^{2-α})^{1/2}; note that, in view of (3.4), this definition of B_n differs from that in the statement of Theorem 2.1 by only a constant multiple in the limit.
The following is a sketch of how we plan to show (3.6). For K ≥ 0, write S_n = S_n^{K,1} + S_n^{K,2}, where S_n^{K,1} collects the contribution of the first K coordinates and S_n^{K,2} that of the remaining ones. We shall show that (3.8) and (3.9) hold for all ǫ > 0. The reason that (3.8) and (3.9) suffice for (3.6) is the following: fix ǫ > 0; since S_n = S_n^{K,1} + S_n^{K,2}, the two estimates combine to give (3.6). For n, K ≥ 1, define the auxiliary random variable U_{n,K}. We start by showing that S_n^{K,1} is stochastically bounded by U_{n,K}, i.e., that (3.10) holds for all x > 0. To that end, let (ε_jk : j, k ≥ 1) be a family of i.i.d. Rademacher random variables, independent of the family (T_jk : j, k ≥ 1), and let P_ε denote the conditional probability given (T_jk : j, k ≥ 1). Using Theorem 4.4, page 95 in Ledoux and Talagrand (1991), (3.10) follows. By the result in one dimension (Theorem 2.2, for example), it follows that, for all k ≥ 1, the k-th coordinate satisfies a central limit theorem as n → ∞, with σ > 0 independent of k. Thus, S_n^{K,1} converges weakly, with the limit determined by a normal random variable G with mean zero and variance σ². Thus, (3.8) will follow if the following is shown: for all η > 0, the remaining contribution is negligible. Clearly, it suffices to bound the corresponding expectation, which immediately follows from the Markov inequality along with (3.1). Thus, (3.8) follows.
Example 2. Fix 1 < p < 2. We first construct a bounded symmetric random variable X taking values in c_0 (the space of sequences converging to zero, equipped with the sup norm) so that n^{-1/p} (X_1 + ... + X_n) does not converge to zero in probability, where X_1, X_2, ... are i.i.d. copies of X. Let (ε_j : j ≥ 1) be a sequence of i.i.d. Rademacher random variables. We shall use the fact that there exists K ∈ (0, ∞) so that

(3.14) P(ε_1 + ... + ε_n > t) ≥ exp(−K t²/n)

for all n ≥ 1 and t > 0 such that n^{1/2} K ≤ t ≤ K^{-1} n; this follows from (4.2) on page 90 in Ledoux and Talagrand (1991). Define

X := a_1 ε_1 e_1 + a_2 ε_2 e_2 + ..., where a_j := K {log(j ∨ 2)}^{(1-p)/2}, j ≥ 1,

K is the constant in (3.14), and e_j is as defined in (3.2). Clearly, X is a c_0-valued, symmetric, bounded random variable. Let X_1, X_2, ... denote i.i.d. copies of X. A coordinatewise lower bound on ‖X_1 + ... + X_n‖ shows that, for proving that n^{-1/p} (X_1 + ... + X_n) does not converge to zero in probability, it suffices to show (3.15). To that aim, define l_n := ⌈exp(n^{2/p})⌉, n ≥ 1, and consider the coordinate l_n. It is easy to see that, as n → ∞,

(3.16) log l_n ∼ n^{2/p}.

Thus, n^{1/p} a_{l_n}^{-1} ≫ n^{1/2}, and for n large enough an appeal to (3.14), together with (3.16), shows that the relevant coordinate probability is bounded away from zero. What we have shown can be summed up as follows: for n large enough, the probability in (3.15) is bounded away from zero, and thus (3.15) follows.

Now define Y by (3.17), where S is an R-valued (symmetric) Cauchy random variable and U is a Bernoulli(1/2) random variable such that X, S, and U are all independent. We start by showing that Y is in the domain of attraction of a 1-stable law on B. Let ((X_i, S_i, U_i) : i ≥ 1) denote i.i.d. copies of (X, S, U). Since X has zero mean, Theorem 9.21 in Ledoux and Talagrand (1991) applies to the partial sums of (X_i 1(U_i = 0)). We shall show, by an application of the contraction principle (Theorem 4.4 in Ledoux and Talagrand (1991)), that (3.18) holds. Let (ε_j : j ≥ 1) be a sequence of i.i.d. Rademacher random variables independent of ((X_i, S_i, U_i) : i ≥ 1).
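Inequality (3.14) is a sub-Gaussian lower bound on the tail of the Rademacher random walk, and its content can be checked numerically. In the sketch below, the constant K = 2 and the point (n, t) = (400, 40), which lies in the admissible range for that K, are our illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
n, t, reps = 400, 40, 100_000

# reps independent copies of eps_1 + ... + eps_n for Rademacher eps_i
eps = 2 * rng.integers(0, 2, size=(reps, n), dtype=np.int8) - 1
s = eps.sum(axis=1, dtype=np.int64)

p_emp = (s > t).mean()             # empirical P(eps_1 + ... + eps_n > t)
lower = np.exp(-2 * t ** 2 / n)    # exp(-K t^2 / n) with K = 2
print(p_emp, lower)                # the empirical tail dominates the bound
```

Here t = 2 n^{1/2}, so the central limit theorem puts the empirical tail near the standard normal tail at 2, comfortably above exp(-K t²/n).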
Let P_ε denote the conditional probability given ((X_i, S_i, U_i) : i ≥ 1); for all u > 0 the desired comparison then holds, the inequality following from the contraction principle. This shows (3.18). By Theorem 3 on page 580 in Feller (1971), the sums n^{-1}(S_1 + ... + S_n) converge weakly to some Cauchy random variable Z, and thus it is immediate that Y is in the domain of attraction of a 1-stable law. For a positive number M_n, let Y_ni be the truncation of Y_i to the ball of radius M_n, as defined in (1.2) with L identically equal to zero, and let S_n := Y_n1 + ... + Y_nn. We will show that n^{-1/p} S_n does not converge to 0 in probability whenever M_n → ∞. By arguments similar to those leading to (3.18), a comparison inequality holds for every u > 0. Note that, since X is bounded and M_n goes to infinity, for n large enough the truncation does not affect the summands with U_i = 0. Observing that if (ε_1, ε_2) are i.i.d. Rademacher random variables independent of (X, S, U), then

X 1(U = 0) + x S 1(U = 1) 1(|S| ≤ M_n/‖x‖) =_d ε_1 X 1(U = 0) + x ε_2 |S| 1(U = 1) 1(|S| ≤ M_n/‖x‖),

exactly the same arguments as before show that, for n large enough and u > 0, the analogous bound holds. The above can be summarized as follows: there exists N < ∞ so that

P(‖X_1 1(U_1 = 0) + ... + X_n 1(U_n = 0)‖ > u) ≤ 4 P(‖S_n‖ > u)

for all n ≥ N and u > 0. Denote N_n := 1(U_1 = 0) + ... + 1(U_n = 0), and denote by P_U the conditional probability given U_1, U_2, .... Note that

P(‖X_1 1(U_1 = 0) + ... + X_n 1(U_n = 0)‖ > u) ≥ E[P_U(‖X_1 1(U_1 = 0) + ... + X_n 1(U_n = 0)‖ > u) 1(N_n > n/3)].
Another application of the contraction principle shows that, on the set {N_n > n/3}, the conditional probability above admits a lower bound involving only ⌊n/3⌋ of the summands; the resulting estimate carries the factor P(N_n > n/3), which tends to one by the law of large numbers.
All of the above calculations, put together, show that the resulting lower bound on P(‖S_n‖ > u) holds uniformly in u. Since n^{-1/p}(X_1 + ... + X_n) does not converge to zero in probability, it follows that n^{-1/p} S_n does not converge to 0 in probability either.
The above calculations can be used to construct an example where (2.4) does not hold, in the following way. Fix 1 < p < 2 and a sequence (M_n) satisfying 1 ≪ M_n ≪ n^{2/p−1}, and define Y by (3.17). The argument that leads to (3.4) from (3.3), applied to Y, allows us to conclude from (3.19) that P(‖Y‖ > x) ∼ Cx^{-1} as x → ∞, for some C ∈ (0, ∞). Since 2/p − 1 < 1, it follows that M_n ≪ n, which is a restatement of

lim_{n→∞} n P(‖Y‖ > M_n) = ∞.
Thus, the assumption of hard truncation is satisfied. Set L ≡ 0, let Y_ni be the truncation of Y_i at level M_n, let S_n be the row sum of the triangular array {Y_ni : 1 ≤ i ≤ n}, and note that

B_n := (n M_n^2 P(‖Y‖ > M_n))^{1/2} ∼ (C n M_n)^{1/2} ≪ n^{1/p}.

This shows that B_n^{-1} S_n does not converge weakly, for otherwise n^{-1/p} S_n would converge to zero in probability. Thus, (2.4) does not hold. This is also an example where the claim of Theorem 2.2 does not hold: the space c_0 is not of Rademacher type p for any p > 1, and hence it was possible to construct a zero-mean random variable with finite p-th moment that does not satisfy the law of large numbers with rate n^{1/p}.

Acknowledgment
The author is immensely grateful to his adviser Gennady Samorodnitsky for many helpful discussions, and to Parthanil Roy for his suggestions on the layout. The author also thanks an anonymous referee for some suggestions that significantly improved the paper.