Electronic Journal of Statistics

Tail index estimation, concentration and adaptivity

Stéphane Boucheron and Maud Thomas

Full-text: Open access

Abstract

This paper presents an adaptive version of the Hill estimator based on Lepski’s model selection method. This simple data-driven index selection method is shown to satisfy an oracle inequality and is checked to achieve the lower bound recently derived by Carpentier and Kim. In order to establish the oracle inequality, we derive non-asymptotic variance bounds and concentration inequalities for Hill estimators. These concentration inequalities are derived from Talagrand’s concentration inequality for smooth functions of independent exponentially distributed random variables combined with three tools of Extreme Value Theory: the quantile transform, Karamata’s representation of slowly varying functions, and Rényi’s characterisation for the order statistics of exponential samples. The performance of this computationally and conceptually simple method is illustrated using Monte-Carlo simulations.
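For readers unfamiliar with the estimator named in the abstract, the following is a minimal sketch of the classical (non-adaptive) Hill estimator, computed from the k largest observations of a sample. It is not the authors' adaptive selection rule, whose thresholds are not given on this page. The function names `hill_estimator` and `pareto_sample`, the symbol `gamma` for the extreme value index, and the simulated strict Pareto sample are illustrative assumptions.

```python
import math
import random


def hill_estimator(sample, k):
    """Classical Hill estimator based on the k largest observations.

    Returns the average of log(X_(i) / X_(k+1)) over i = 1..k, where
    X_(1) >= X_(2) >= ... are the order statistics in decreasing order.
    """
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    x = sorted(sample, reverse=True)
    threshold = x[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    return sum(math.log(x[i] / threshold) for i in range(k)) / k


def pareto_sample(n, gamma, rng):
    """Hypothetical helper: strict Pareto draws via the quantile transform.

    If U is uniform on (0, 1], then U ** (-gamma) has survival function
    x ** (-1 / gamma) for x >= 1, so its extreme value index is gamma.
    """
    return [(1.0 - rng.random()) ** (-gamma) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is reproducible
gamma = 0.5             # assumed true index for the simulation
sample = pareto_sample(5000, gamma, rng)

# The estimate depends on the number k of order statistics used;
# choosing k from the data is the problem the paper addresses.
estimates = {k: hill_estimator(sample, k) for k in (50, 200, 1000)}
for k, value in estimates.items():
    print(k, round(value, 3))
```

For a strict Pareto sample the estimator is unbiased for every k, so larger k simply reduces variance. For distributions that are only approximately Pareto in the tail, a larger k also introduces bias, which is why a data-driven choice of k matters.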

Article information

Source
Electron. J. Statist., Volume 9, Number 2 (2015), 2751-2792.

Dates
Received: March 2015
First available in Project Euclid: 18 December 2015

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1450456321

Digital Object Identifier
doi:10.1214/15-EJS1088

Mathematical Reviews number (MathSciNet)
MR3435810

Zentralblatt MATH identifier
1352.60025

Subjects
Primary:
60E15: Inequalities; stochastic orderings
60G70: Extreme value theory; extremal processes
62G30: Order statistics; empirical distribution functions
62G32: Statistics of extreme values; tail inference

Keywords
Hill estimator, adaptivity, Lepski’s method, concentration inequalities, order statistics

Citation

Boucheron, Stéphane; Thomas, Maud. Tail index estimation, concentration and adaptivity. Electron. J. Statist. 9 (2015), no. 2, 2751--2792. doi:10.1214/15-EJS1088. https://projecteuclid.org/euclid.ejs/1450456321



References

  • J. Beirlant, Y. Goegebeur, J. Teugels, and J. Segers. Statistics of extremes. John Wiley & Sons, Ltd., 2004.
  • J. Beirlant, C. Bouquiaux, and B. Werker. Semiparametric lower bounds for tail index estimation. Journal of Statistical Planning and Inference, 136(3):705–729, 2006.
  • N. Bingham, C. Goldie, and J. Teugels. Regular variation. Cambridge University Press, 1987.
  • L. Birgé. An alternative point of view on Lepski’s method. In State of the art in probability and statistics (Leiden, 1999), volume 36 of IMS Lecture Notes Monogr. Ser., pages 113–133. Inst. Math. Statist., 2001.
  • L. Birgé. A new lower bound for multiple hypothesis testing. IEEE Trans. Inform. Theory, 51:1611–1615, 2005.
  • S. Bobkov and M. Ledoux. Poincaré’s inequalities and Talagrand’s concentration phenomenon for the exponential distribution. Probab. Theory Rel. Fields, 107:383–400, 1997.
  • S. Boucheron and M. Thomas. Concentration inequalities for order statistics. Elec. Commun. Probab., 17:1–12, 2012.
  • S. Boucheron, G. Lugosi, and P. Massart. Concentration inequalities. Oxford University Press, 2013.
  • A. Carpentier and A. Kim. Adaptive confidence intervals for the tail coefficient in a wide second order class of Pareto models. Elec. Journ. Statist., 8:2066–2110, 2014.
  • A. Carpentier and A. Kim. Adaptive and minimax optimal estimation of the tail coefficient. Statistica Sinica, 25:1133–1144, 2015.
  • S. Chatterjee. Superconcentration and related topics. Springer-Verlag, 2014.
  • T. Cover and J. Thomas. Elements of Information Theory. John Wiley, 1991.
  • S. Csörgő, P. Deheuvels, and D. Mason. Kernel estimates of the tail index of a distribution. Ann. Statist., 13(3):1050–1077, 1985.
  • J. Danielsson, L. de Haan, L. Peng, and C. G. de Vries. Using a bootstrap method to choose the sample fraction in tail index estimation. J. Multivariate Anal., 76(2):226–248, 2001.
  • D. Darling and P. Erdős. A limit theorem for the maximum of normalized sums of independent random variables. Duke Math. J., 23:143–155, 1956.
  • L. de Haan and A. Ferreira. Extreme value theory. Springer-Verlag, 2006.
  • G. Draisma, L. de Haan, L. Peng, and T. Pereira. A bootstrap-based method to achieve optimality in estimating the extreme value index. Extremes, 2:367–404, 1999.
  • H. Drees. Optimal rates of convergence for estimates of the extreme value index. Ann. Statist., 26(1):434–448, 1998a.
  • H. Drees. On smooth statistical tail functionals. Scand. J. Statist., 25(1):187–210, 1998b.
  • H. Drees. Minimax risk bounds in extreme value theory. Ann. Statist., 29(1):266–294, 2001.
  • H. Drees and E. Kaufmann. Selecting the optimal sample fraction in univariate extreme value estimation. Stochastic Process. Appl., 75(2):149–172, 1998.
  • H. Drees, L. de Haan, and S. Resnick. How to make a Hill plot. Ann. Statist., 28(1):254–274, 2000.
  • J. Geluk, L. de Haan, S. Resnick, and C. Stărică. Second-order regular variation, convolution and the central limit theorem. Stochastic Process. Appl., 69(2):139–159, 1997.
  • E. Giné and V. Koltchinskii. Concentration inequalities and asymptotic results for ratio type empirical processes. Ann. Probab., 34(3):1143–1216, 2006.
  • I. Grama and V. Spokoiny. Statistics of extremes by oracle estimation. Ann. Statist., 36(4):1619–1648, 2008.
  • P. Hall and I. Weissman. On the estimation of extreme tail probabilities. Ann. Statist., 25(3):1311–1326, 1997.
  • P. Hall and A. Welsh. Adaptive estimates of parameters of regular variation. Ann. Statist., 13(1):331–341, 1985.
  • B. Hill. A simple general approach to inference about the tail of a distribution. Ann. Statist., 3:1163–1174, 1975.
  • V. Koltchinskii. Oracle inequalities in empirical risk minimization and sparse recovery problems. Ecole d’Eté de Probabilité de Saint-Flour XXXVIII, volume 2033 of Lecture Notes in Math. Springer-Verlag, 2008.
  • M. Ledoux. The concentration of measure phenomenon. American Mathematical Society, 2001.
  • M. Ledoux and M. Talagrand. Probability in Banach Spaces. Springer-Verlag, 1991.
  • O. Lepski. A problem of adaptive estimation in Gaussian white noise. Teoriya Veroyatnostei i ee Primeneniya, 35(3):459–470, 1990.
  • O. Lepski. Asymptotically minimax adaptive estimation. I. Upper bounds. Optimally adaptive estimates. Teoriya Veroyatnostei i ee Primeneniya, 36(4):645–659, 1991.
  • O. Lepski. Asymptotically minimax adaptive estimation. II. Schemes without optimal adaptation. Adaptive estimates. Teoriya Veroyatnostei i ee Primeneniya, 37(3):468–481, 1992.
  • O. Lepski and A. Tsybakov. Asymptotic exact nonparametric hypothesis testing in sup-norm and at a fixed point. Probab. Theory Rel. Fields, 117(1):17–48, 2000.
  • D. Mason. Laws of large numbers for sums of extreme values. Ann. Probab., 10:754–764, 1982.
  • P. Massart. Concentration inequalities and model selection. Ecole d’Eté de Probabilité de Saint-Flour XXXIV, volume 1896 of Lecture Notes in Math. Springer-Verlag, 2007.
  • P. Mathé. The Lepski principle revisited. Inverse Problems, 22(3):L11–L15, 2006.
  • B. Maurey. Some deviation inequalities. Geometric and Functional Analysis, 1(2):188–197, 1991.
  • S. Novak. Lower bounds to the accuracy of inference on heavy tails. Bernoulli, 20(2):979–989, 2014.
  • S. Resnick. Heavy-tail phenomena: probabilistic and statistical modeling. Springer-Verlag, 2007.
  • J. Segers. Abelian and Tauberian theorems on the bias of the Hill estimator. Scand. J. Statist., 29(3):461–483, 2002.
  • M. Talagrand. A new isoperimetric inequality and the concentration of measure phenomenon. In Geometric aspects of functional analysis (1989–90), volume 1469 of Lecture Notes in Math., pages 94–124. Springer-Verlag, 1991.
  • M. Talagrand. A new look at independence. Ann. Probab., 24:1–34, 1996a.
  • M. Talagrand. New concentration inequalities in product spaces. Inventiones Mathematicae, 126:505–563, 1996b.
  • M. Talagrand. The generic chaining. Springer-Verlag, 2005.
  • A. B. Tsybakov. Pointwise and sup-norm sharp adaptive estimation of functions on the Sobolev classes. Ann. Statist., 26(6):2420–2469, 1998.
  • S. van de Geer. Applications of empirical process theory. Cambridge University Press, 2000.
  • H. Wickham. ggplot2: elegant graphics for data analysis. Springer-Verlag, 2009.
  • H. Wickham. Advanced R. Chapman & Hall/CRC, 2014.