The Annals of Applied Probability

Gaussian limits for generalized spacings

Yu. Baryshnikov, Mathew D. Penrose, and J. E. Yukich


Abstract

Nearest neighbor cells in ℝ^d, d∈ℕ, are used to define coefficients of divergence (φ-divergences) between continuous multivariate samples. For large sample sizes, such distances are shown to be asymptotically normal with a variance depending on the underlying point density. In d=1, this extends classical central limit theory for sum functions of spacings. The general results yield central limit theorems for logarithmic k-spacings, information gain, log-likelihood ratios and the number of pairs of sample points within a fixed distance of each other.
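As a rough illustration of two of the statistics named in the abstract, the following sketch computes, in d=1, a sum of logarithmic k-spacings and the number of pairs of sample points within a fixed distance of each other. It is not taken from the paper: the function names, the normalization log(n·D), and the convention of adding the endpoints 0 and 1 to the sample are assumptions made for this example only.

```python
import math
import random


def log_k_spacings_sum(sample, k=1):
    """Sum of log(n * D) over the k-spacings D = X_(i+k) - X_(i).

    Assumptions of this sketch (not from the paper): the sample lies in
    (0, 1), has no ties, and the endpoints 0 and 1 are added as boundary
    points before the spacings are formed.
    """
    pts = [0.0] + sorted(sample) + [1.0]
    n = len(sample)
    return sum(math.log(n * (pts[i + k] - pts[i]))
               for i in range(len(pts) - k))


def close_pairs(sample, r):
    """Number of unordered pairs of points at distance at most r (d = 1)."""
    pts = sorted(sample)
    count = 0
    lo = 0
    for hi in range(len(pts)):
        # Advance lo to the first point within distance r of pts[hi].
        while pts[hi] - pts[lo] > r:
            lo += 1
        # Every point with index in [lo, hi) forms a close pair with pts[hi].
        count += hi - lo
    return count


if __name__ == "__main__":
    random.seed(0)  # fixed seed so that repeated runs agree
    xs = [random.random() for _ in range(1000)]
    print(log_k_spacings_sum(xs, k=1))
    print(close_pairs(xs, r=0.001))
```

Repeating the experiment over many independent samples and plotting a histogram of either statistic gives an informal picture of the asymptotic normality described in the abstract; the sketch makes no attempt to reproduce the centering or variance constants of the paper.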

Article information

Source
Ann. Appl. Probab., Volume 19, Number 1 (2009), 158-185.

Dates
First available in Project Euclid: 20 February 2009

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1235140336

Digital Object Identifier
doi:10.1214/08-AAP537

Mathematical Reviews number (MathSciNet)
MR2498675

Zentralblatt MATH identifier
1159.60315

Subjects
Primary:
60F05: Central limit and other weak theorems
60D05: Geometric probability and stochastic geometry [See also 52A22, 53C65]
62H11: Directional data; spatial statistics

Keywords
φ-divergence, central limit theorems, spacing statistics, logarithmic spacings, information gain, log-likelihood

Citation

Baryshnikov, Yu.; Penrose, Mathew D.; Yukich, J. E. Gaussian limits for generalized spacings. Ann. Appl. Probab. 19 (2009), no. 1, 158–185. doi:10.1214/08-AAP537. https://projecteuclid.org/euclid.aoap/1235140336
