Exponential tail bounds for sums play an important role in statistics, but the example of the t-statistic shows that the exponential tail decay may be lost when population parameters need to be estimated from the data. However, it turns out that if Studentizing is accompanied by estimating the location parameter in a suitable way, then the t-statistic regains the exponential tail behavior. Motivated by this example, the paper analyzes other ways of empirically standardizing sums and establishes tail bounds that are sub-Gaussian or even closer to normal for the following settings: standardization with Studentized contrasts for normal observations, standardization with the log likelihood ratio statistic for observations from an exponential family, and standardization via self-normalization for observations from a symmetric distribution with unknown center of symmetry. The last of these standardizations gives rise to a novel scan statistic for heteroscedastic data, whose asymptotic power is analyzed in the case where the observations have a log-concave distribution.
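As background for the self-normalization setting, the following minimal Python sketch illustrates the classical special case in which the center of symmetry is known to be zero. It is not the paper's construction, which treats an unknown center. For independent observations that are symmetric about zero, conditioning on the absolute values and applying Hoeffding's inequality to the random signs gives P(S_n / V_n > t) <= exp(-t^2 / 2), where S_n is the sum of the observations and V_n is the square root of the sum of their squares. The function names, the choice of a Student t distribution with 2 degrees of freedom, and the values of n, t, and the number of replications are assumptions made only for illustration.

```python
import numpy as np


def self_normalized_sum(x):
    """Return S_n / V_n along the last axis, with S_n = sum(x_i), V_n = sqrt(sum(x_i^2))."""
    x = np.asarray(x, dtype=float)
    return x.sum(axis=-1) / np.sqrt((x ** 2).sum(axis=-1))


def empirical_tail(n=20, reps=100_000, t=2.0, seed=0):
    """Compare the simulated tail frequency of S_n / V_n with exp(-t^2 / 2).

    Assumption for illustration: observations are i.i.d. Student t with
    2 degrees of freedom, which is symmetric about the KNOWN center 0
    and heavy-tailed (infinite variance).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_t(df=2, size=(reps, n))
    stat = self_normalized_sum(x)
    freq = float(np.mean(stat > t))          # simulated P(S_n / V_n > t)
    bound = float(np.exp(-t ** 2 / 2))       # sub-Gaussian bound for symmetric data
    return freq, bound


freq, bound = empirical_tail()
print(f"simulated tail frequency: {freq:.4f}, sub-Gaussian bound: {bound:.4f}")
```

The simulated frequency should fall below the bound even though the observations have infinite variance, which is the kind of behavior that motivates standardizing by an empirical quantity rather than by a population parameter.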
Research supported by NSF grants DMS-1501767 and DMS-1916074.
The author would like to thank a referee for comments about an exponential tail bound for the t-statistic.
"Tail bounds for empirically standardized sums." Electron. J. Statist. 16 (1) 2406 - 2431, 2022. https://doi.org/10.1214/22-EJS1995