Generalized likelihood ratio (GLR) test statistics are often used in the detection of spatial clustering in case-control and case-population datasets to check for a significantly large proportion of cases within some scanning window. The traditional spatial scan test statistic takes the supremum of the GLR values over all windows, whereas the average likelihood ratio (ALR) test statistic that we consider here takes an average of the GLR values. Numerical experiments in the literature and in this paper show that the ALR test statistic has greater power than the spatial scan statistic. In this paper we develop accurate tail probability approximations for the ALR test statistic, which allow us to bypass the computationally intensive Monte Carlo procedures otherwise needed to estimate p-values. In models that adjust for covariates, these Monte Carlo evaluations require an initial fitting of parameters that can result in very biased p-value estimates.
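The contrast between the two statistics can be illustrated with a minimal sketch. This is not the paper's procedure. It assumes a one-dimensional sequence of 0/1 case-control labels, contiguous windows of a single fixed width, a one-sided Bernoulli GLR, equal weights in the average, and a Monte Carlo p-value obtained by randomly relabelling cases and controls. The function names (`window_glr`, `scan_and_alr`, `monte_carlo_pvalues`) are hypothetical and chosen only for illustration.

```python
import math
import random


def bernoulli_loglik(k, n):
    """Maximized Bernoulli log-likelihood for k cases among n subjects."""
    if n == 0 or k == 0 or k == n:
        return 0.0
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)


def window_glr(labels, start, stop):
    """One-sided GLR for an elevated case proportion in labels[start:stop]."""
    n_in = stop - start
    k_in = sum(labels[start:stop])
    n_out = len(labels) - n_in
    k_out = sum(labels) - k_in
    # No excess of cases inside the window: the GLR is 1 by convention.
    if n_in == 0 or n_out == 0 or k_in * n_out <= k_out * n_in:
        return 1.0
    log_glr = (bernoulli_loglik(k_in, n_in)
               + bernoulli_loglik(k_out, n_out)
               - bernoulli_loglik(k_in + k_out, n_in + n_out))
    return math.exp(max(log_glr, 0.0))  # guard against rounding below zero


def scan_and_alr(labels, width):
    """Return (max GLR, mean GLR) over all windows of the given width."""
    glrs = [window_glr(labels, s, s + width)
            for s in range(len(labels) - width + 1)]
    return max(glrs), sum(glrs) / len(glrs)


def monte_carlo_pvalues(labels, width, n_sim=199, seed=0):
    """Relabelling-based Monte Carlo p-values for both statistics (assumed null)."""
    rng = random.Random(seed)
    obs_scan, obs_alr = scan_and_alr(labels, width)
    exceed_scan = exceed_alr = 0
    shuffled = list(labels)
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # random relabelling of cases and controls
        sim_scan, sim_alr = scan_and_alr(shuffled, width)
        exceed_scan += sim_scan >= obs_scan
        exceed_alr += sim_alr >= obs_alr
    return (1 + exceed_scan) / (1 + n_sim), (1 + exceed_alr) / (1 + n_sim)


if __name__ == "__main__":
    # Toy data: a block of 8 cases among mostly controls.
    labels = [0] * 20 + [1] * 8 + [0, 1, 0, 0, 1] * 4
    print(scan_and_alr(labels, width=8))
    print(monte_carlo_pvalues(labels, width=8))
```

Each call to `monte_carlo_pvalues` recomputes every window GLR once per simulated relabelling, which is the kind of cost that an analytic tail probability approximation would avoid.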