The Annals of Statistics

Detection of spatial clustering with average likelihood ratio test statistics

Hock Peng Chan

Source: Ann. Statist. Volume 37, Number 6B (2009), 3985-4010.

Abstract

Generalized likelihood ratio (GLR) test statistics are often used to detect spatial clustering in case-control and case-population datasets by checking for a significantly large proportion of cases within some scanning window. The traditional spatial scan test statistic takes the supremum of the GLR values over all windows, whereas the average likelihood ratio (ALR) test statistic that we consider here takes an average of the GLR values. Numerical experiments in the literature and in this paper show that the ALR test statistic has more power than the spatial scan statistic. In this paper we develop accurate tail probability approximations for the ALR test statistic that allow us to bypass computer-intensive Monte Carlo procedures for estimating p-values. In models that adjust for covariates, these Monte Carlo evaluations require an initial fitting of parameters, which can result in very biased p-value estimates.
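
The contrast between the two statistics, and the Monte Carlo step that the tail approximations are meant to replace, can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: a case-control setting with a one-sided binomial GLR, circular windows centered at the data points, an unweighted average of the exponentiated log GLR values as the ALR, and a label-permutation p-value. All function names (log_glr, window_log_glrs, scan_and_alr, permutation_p_value) are hypothetical.

```python
import math
import random


def log_glr(cases_in, n_in, cases_total, n_total):
    """One-sided binomial log GLR for a window holding n_in of n_total points."""
    n_out = n_total - n_in
    cases_out = cases_total - cases_in
    if n_in == 0 or n_out == 0:
        return 0.0
    p_in, p_out = cases_in / n_in, cases_out / n_out
    if p_in <= p_out:  # only an excess of cases inside the window counts
        return 0.0
    p0 = cases_total / n_total

    def loglik(k, n, p):
        # Binomial log-likelihood with the convention 0 * log(0) = 0.
        out = 0.0
        if k > 0:
            out += k * math.log(p)
        if n - k > 0:
            out += (n - k) * math.log(1.0 - p)
        return out

    return (loglik(cases_in, n_in, p_in) + loglik(cases_out, n_out, p_out)
            - loglik(cases_total, n_total, p0))


def window_log_glrs(points, labels, radii):
    """Log GLR for every circle (assumed window family) centered at a point."""
    n_total, cases_total = len(points), sum(labels)
    values = []
    for cx, cy in points:
        for r in radii:
            inside = [(x - cx) ** 2 + (y - cy) ** 2 <= r * r for x, y in points]
            n_in = sum(inside)
            cases_in = sum(lab for lab, hit in zip(labels, inside) if hit)
            values.append(log_glr(cases_in, n_in, cases_total, n_total))
    return values


def scan_and_alr(points, labels, radii):
    """Return (scan statistic, ALR statistic) over the same set of windows."""
    values = window_log_glrs(points, labels, radii)
    scan = max(values)                                    # supremum over windows
    alr = sum(math.exp(v) for v in values) / len(values)  # unweighted average
    return scan, alr


def permutation_p_value(points, labels, radii, stat_index, n_perm=99, seed=0):
    """Monte Carlo p-value by shuffling case/control labels (no covariates)."""
    rng = random.Random(seed)
    observed = scan_and_alr(points, labels, radii)[stat_index]
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if scan_and_alr(points, shuffled, radii)[stat_index] >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    demo_rng = random.Random(42)
    pts = [(demo_rng.random(), demo_rng.random()) for _ in range(40)]
    # Synthetic labels: cases (1) are more likely near the lower-left corner.
    lab = [1 if demo_rng.random() < (0.8 if x + y < 0.6 else 0.3) else 0
           for x, y in pts]
    radii = [0.1, 0.2, 0.3]
    print("scan, ALR:", scan_and_alr(pts, lab, radii))
    print("ALR permutation p-value:", permutation_p_value(pts, lab, radii, 1))
```

Passing stat_index=0 gives the permutation p-value for the scan statistic and stat_index=1 the one for the ALR. Each Monte Carlo replicate recomputes every window, which is the cost that an analytic tail approximation avoids. The sketch has no covariate adjustment, so it does not show the bias from initial parameter fitting mentioned above.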

Primary Subjects: 60F10, 62G10
Secondary Subjects: 60G55
Keywords: Average likelihood ratio; change of measure; generalized likelihood ratio; logistic model; moderate deviations; scan statistic; spatial clustering

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1256303534
Digital Object Identifier: doi:10.1214/09-AOS701

2009 © Institute of Mathematical Statistics