Electronic Journal of Statistics

Semi-parametric regression estimation of the tail index

Mofei Jia, Emanuele Taufer, and Maria Michela Dickson

Full-text: Open access


Consider a distribution $F$ with regularly varying tails of index $-\alpha$. An estimation strategy for $\alpha$ is proposed that exploits the relation between the behavior of the tail at infinity and that of the characteristic function at the origin. The estimator is based on a semi-parametric regression model in which a nonparametric component controls the bias and a parametric component produces the actual estimate. Implementation is simple because it can rely on standard software packages for generalized additive models. A generalized cross-validation procedure is suggested to handle the bias-variance trade-off. Theoretical properties of the proposed method are derived, and simulations show the performance of the estimator in a wide range of cases. An application to data sets on city sizes, addressing the debated issue of distinguishing Pareto-type tails from log-normal tails, illustrates how the proposed method works in practice.
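
As a rough illustration of the link between tail behavior and the characteristic function near the origin, the sketch below regresses $\log(1 - U_n(t))$ on $\log t$ for small $t > 0$, where $U_n(t)$ is the real part of the empirical characteristic function. It is not the semi-parametric estimator of the paper: it keeps only a parametric slope, omits the nonparametric bias component and the generalized cross-validation step, and assumes $0 < \alpha < 2$ with $1 - \mathrm{Re}\,\varphi(t)$ behaving like $c\,t^{\alpha}$ near zero. The function name `ecf_tail_index`, the grid of $t$ values, and the Cauchy test sample (for which $\varphi(t) = e^{-|t|}$ and $\alpha = 1$) are choices made here for illustration.

```python
import numpy as np


def ecf_tail_index(x, t_grid):
    """Slope of log(1 - Re ECF(t)) on log(t) over small positive t.

    Hypothetical helper: a plain least-squares simplification, not the
    semi-parametric estimator described in the abstract.
    """
    x = np.asarray(x, dtype=float)
    t_grid = np.asarray(t_grid, dtype=float)
    # Real part of the empirical characteristic function: mean of cos(t * X_i).
    u_n = np.cos(np.outer(t_grid, x)).mean(axis=1)
    # Degree-1 least-squares fit; polyfit returns [slope, intercept].
    slope, _intercept = np.polyfit(np.log(t_grid), np.log1p(-u_n), 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.standard_cauchy(50_000)   # true tail index alpha = 1
    t_grid = np.geomspace(0.02, 0.2, 30)   # assumed grid near the origin
    print(ecf_tail_index(sample, t_grid))
```

For the Cauchy sample the slope should land near 1, slightly below it, because $\log(1 - e^{-t})$ curves away from $\log t$ as $t$ grows. That curvature is the kind of bias the nonparametric component in the abstract is meant to absorb, and the choice of grid reflects the bias-variance trade-off mentioned there.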

Article information

Electron. J. Statist., Volume 12, Number 1 (2018), 224-248.

Received: May 2017
First available in Project Euclid: 12 February 2018

Primary: 62G32: Statistics of extreme values; tail inference
Secondary: 62J05: Linear regression

Keywords: Tail index; heavy-tailed distributions; regular variation; empirical characteristic function; Zipf’s law

Creative Commons Attribution 4.0 International License.


Jia, Mofei; Taufer, Emanuele; Dickson, Maria Michela. Semi-parametric regression estimation of the tail index. Electron. J. Statist. 12 (2018), no. 1, 224–248. doi:10.1214/18-EJS1394. https://projecteuclid.org/euclid.ejs/1518426109

