The Annals of Statistics

Note on distribution free testing for discrete distributions

Abstract

The paper proposes a one-to-one transformation of the vector of components $\{Y_{in}\}_{i=1}^{m}$ of Pearson’s chi-square statistic,

$Y_{in}=\frac{\nu_{in}-np_{i}}{\sqrt{np_{i}}},\qquad i=1,\ldots,m,$

into another vector $\{Z_{in}\}_{i=1}^{m}$, which, therefore, contains the same “statistical information,” but is asymptotically distribution free. Hence any functional/test statistic based on $\{Z_{in}\}_{i=1}^{m}$ is also asymptotically distribution free. Natural examples of such test statistics are the traditional goodness-of-fit statistics based on the partial sums $\sum_{i\leq k}Z_{in}$.
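As a small numerical illustration of the definition above, the following sketch computes the components $Y_{in}$ from observed cell counts $\nu_{in}$ and hypothesised probabilities $p_{i}$, recovers Pearson’s chi-square statistic as $\sum_{i}Y_{in}^{2}$, and forms a Kolmogorov–Smirnov type functional of partial sums. The function names and the example counts are hypothetical, chosen only for illustration. The abstract does not specify the transformation to $Z_{in}$, so the partial sums here are taken over the untransformed $Y_{in}$; the same functional could be applied to $Z_{in}$ once that vector is available.

```python
import math
from itertools import accumulate


def pearson_components(counts, probs):
    """Return Y_i = (nu_i - n * p_i) / sqrt(n * p_i) for each cell i.

    counts: observed frequencies nu_i; probs: hypothesised p_i (all > 0).
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)  # total sample size
    return [(nu - n * p) / math.sqrt(n * p) for nu, p in zip(counts, probs)]


def partial_sums(components):
    """Return the running sums S_k = sum of the first k components."""
    return list(accumulate(components))


# Hypothetical data: n = 100 observations in m = 4 equiprobable cells.
counts = [18, 22, 30, 30]
probs = [0.25, 0.25, 0.25, 0.25]

y = pearson_components(counts, probs)

# Pearson's chi-square statistic is the squared length of the vector Y.
chi_square = sum(v * v for v in y)

# A Kolmogorov-Smirnov type functional: the largest absolute partial sum.
ks_type = max(abs(s) for s in partial_sums(y))

# The components always satisfy sum_i Y_i * sqrt(p_i) = 0.
constraint = sum(v * math.sqrt(p) for v, p in zip(y, probs))

print(y, chi_square, ks_type, constraint)
```

The linear constraint checked in the last step holds for any counts and probabilities, since $\sum_{i}(\nu_{in}-np_{i})=0$. It ties the vector $Y_{n}$ to the hypothesised probabilities through $\sqrt{p_{i}}$, which is one way to see why functionals of the untransformed $Y_{in}$ other than the chi-square statistic itself are generally not distribution free.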

The supplement shows how the approach works in a problem of independent interest: goodness-of-fit testing of a power-law distribution, with the Zipf law and the Karlin–Rouault law as particular alternatives.

Article information

Source
Ann. Statist., Volume 41, Number 6 (2013), 2979-2993.

Dates
First available in Project Euclid: 1 January 2014

https://projecteuclid.org/euclid.aos/1388545675

Digital Object Identifier
doi:10.1214/13-AOS1176

Mathematical Reviews number (MathSciNet)
MR3161454

Zentralblatt MATH identifier
1294.62095

Subjects
Primary: 62D05: Sampling theory, sample surveys 62E20: Asymptotic distribution theory
Secondary: 62E05 62F10: Point estimation

Citation

Khmaladze, Estate. Note on distribution free testing for discrete distributions. Ann. Statist. 41 (2013), no. 6, 2979–2993. doi:10.1214/13-AOS1176. https://projecteuclid.org/euclid.aos/1388545675

References

• Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. Ann. Math. Statistics 23 193–212.
• Choulakian, V., Lockhart, R. A. and Stephens, M. A. (1994). Cramér–von Mises statistics for discrete distributions. Canad. J. Statist. 22 125–137.
• Cramér, H. (1946). Mathematical Methods of Statistics. Princeton Univ. Press, Princeton.
• Einmahl, J. H. J. and Khmaladze, E. V. (2001). The Two-Sample Problem in $\mathbb{R}^{m}$ and Measure-Valued Martingales. Institute of Mathematical Statistics Lecture Notes—Monograph Series 36 434–463. IMS, Beachwood, OH.
• Einmahl, J. H. J. and McKeague, I. W. (1999). Confidence tubes for multiple quantile plots via empirical likelihood. Ann. Statist. 27 1348–1367.
• Fisher, R. A. (1922). On the interpretation of $\chi^{2}$ from contingency tables, and the calculation of $P$. J. R. Stat. Soc. 85 87–94.
• Fisher, R. A. (1924). Conditions under which $\chi^{2}$ measures the discrepancy between observation and hypothesis. J. R. Stat. Soc. 87 442–450.
• Goldstein, M. L., Morris, S. A. and Yen, G. G. (2004). Problems with fitting the power-law distributions. Eur. Phys. J. B 41 255–258.
• Greenwood, P. E. and Nikulin, M. S. (1996). A Guide to Chi-Squared Testing. Wiley, New York.
• Henze, N. (1996). Empirical-distribution-function goodness-of-fit tests for discrete models. Canad. J. Statist. 24 81–93.
• Kendall, M. and Stuart, A. (1963). The Advanced Theory of Statistics, Vol. 2. C. Griffin, London. Reprinted as Vol. 2A: Classical Inference and the Linear Model in 2009.
• Khmaladze, E. V. (1979). The use of omega-square tests for testing parametric hypotheses. Theory Probab. Appl. 24 283–302.
• Khmaladze, E. V. (2013). Supplement to “Note on distribution free testing for discrete distributions.” DOI:10.1214/13-AOS1176SUPP.
• Khmaladze, È. V. (1993). Goodness of fit problem and scanning innovation martingales. Ann. Statist. 21 798–829.
• Kolmogorov, A. N. (1933). Sulla determinazione empirica di una legge di distribuzione, Giornale dell’Istituto Italiano degli Attuari; see also Kolmogorov, A. N. (1992). Selected Works. Vol. II: Probability theory and mathematical statistics. Kluwer Academic, Dordrecht.
• Owen, A. (2001). Empirical Likelihood. Chapman & Hall, Boca Raton, FL.
• Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine 50 157–175. Reprinted in Karl Pearson’s Early Statistical Papers, 1948, 339–357, Cambridge Univ. Press, Cambridge.
• Rao, C. R. (1965). Linear Statistical Inference and Its Applications. Wiley, New York.
• Rosenblatt, M. (1952). Remarks on a multivariate transformation. Ann. Math. Statistics 23 470–472.
• Smirnov, N. V. (1937). On the distribution of $\omega^{2}$-test of Mises. Mat. Sb. 2 973–993 (in Russian).
• Stigler, S. M. (1999). Statistics on the Table. Harvard Univ. Press, Cambridge, MA.
• van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press, Cambridge.
• Wald, A. and Wolfowitz, J. (1939). Confidence limits for continuous distribution function. Ann. Math. Statistics 10 105–118.

Supplemental materials

• Supplement: Distribution free Kolmogorov–Smirnov and Cramér–von Mises tests for power-law distribution. We compare the asymptotic behavior of the two classical goodness-of-fit tests based on partial sums of the $Y_{in}$ and of their distribution free transformations $Z_{in}$, and show their power under Zipf’s law and under the Karlin–Rouault law as alternatives.