## The Annals of Statistics

- Ann. Statist.
- Volume 38, Number 2 (2010), 1010-1033.

### Optimal and fast detection of spatial clusters with scan statistics

#### Abstract

We consider the detection of multivariate spatial clusters in the Bernoulli model with *N* locations, where the design distribution has weakly dependent marginals. The locations are scanned with a rectangular window with sides parallel to the axes and with varying sizes and aspect ratios. Multivariate scan statistics pose a statistical problem due to the multiple testing over many scan windows, as well as a computational problem because statistics have to be evaluated on many windows. This paper introduces methodology that leads to both statistically optimal inference and computationally efficient algorithms. The main difference to the traditional calibration of scan statistics is the concept of grouping scan windows according to their sizes, and then applying different critical values to different groups. It is shown that this calibration of the scan statistic results in optimal inference for spatial clusters on both small scales and on large scales, as well as in the case where the cluster lives on one of the marginals. Methodology is introduced that allows for an efficient approximation of the set of all rectangles while still guaranteeing the statistical optimality results described above. It is shown that the resulting scan statistic has a computational complexity that is almost linear in *N*.
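The idea of grouping scan windows by size and giving each group its own critical value can be illustrated with a minimal sketch. It is not the paper's algorithm. The following are all assumptions made for brevity: the `g` x `g` grid of cells that approximates the set of rectangles, the size classes defined by `floor(log2(N / n))`, the permutation-based calibration, and the Bonferroni-style split of the level `alpha` across groups. The function names `log_lr`, `group_maxima` and `critical_values` are hypothetical.

```python
import math
import random
from collections import defaultdict


def log_lr(k, n, K, N):
    """One-sided Bernoulli log-likelihood ratio: k of n successes inside
    the window versus K of N overall. Zero unless the inside rate is higher."""
    if n == 0 or n == N or k * N <= K * n:
        return 0.0

    def h(a, b):  # a*log(a/b) + (b-a)*log((b-a)/b), with 0*log(0) = 0
        s = a * math.log(a / b) if a > 0 else 0.0
        return s + ((b - a) * math.log((b - a) / b) if b > a else 0.0)

    return h(k, n) + h(K - k, N - n) - h(K, N)


def prefix_sums(points, labels, g):
    """2-D prefix sums of point counts and success counts on a g x g grid."""
    cnt = [[0] * (g + 1) for _ in range(g + 1)]
    suc = [[0] * (g + 1) for _ in range(g + 1)]
    for (x, y), z in zip(points, labels):
        i, j = min(int(x * g), g - 1), min(int(y * g), g - 1)
        cnt[i + 1][j + 1] += 1
        suc[i + 1][j + 1] += z
    for t in (cnt, suc):
        for i in range(1, g + 1):
            for j in range(1, g + 1):
                t[i][j] += t[i - 1][j] + t[i][j - 1] - t[i - 1][j - 1]
    return cnt, suc


def rect(t, i0, i1, j0, j1):
    """Sum over grid cells [i0, i1) x [j0, j1) from a prefix-sum table."""
    return t[i1][j1] - t[i0][j1] - t[i1][j0] + t[i0][j0]


def group_maxima(points, labels, g=8):
    """Maximum scan statistic within each size class of grid rectangles."""
    N, K = len(labels), sum(labels)
    cnt, suc = prefix_sums(points, labels, g)
    best = defaultdict(float)
    for i0 in range(g):
        for i1 in range(i0 + 1, g + 1):
            for j0 in range(g):
                for j1 in range(j0 + 1, g + 1):
                    n = rect(cnt, i0, i1, j0, j1)
                    if n == 0:
                        continue
                    k = rect(suc, i0, i1, j0, j1)
                    grp = (N // n).bit_length() - 1  # floor(log2(N / n))
                    best[grp] = max(best[grp], log_lr(k, n, K, N))
    return dict(best)


def critical_values(points, labels, g=8, n_perm=99, alpha=0.1, seed=0):
    """Per-group critical values from label permutations (assumed calibration)."""
    rng = random.Random(seed)
    perm = list(labels)
    sims = defaultdict(list)
    for _ in range(n_perm):
        rng.shuffle(perm)
        for grp, val in group_maxima(points, perm, g).items():
            sims[grp].append(val)
    level = alpha / len(sims)  # assumption: equal split of alpha across groups
    crit = {}
    for grp, vals in sims.items():
        vals.sort()
        idx = math.ceil((1 - level) * len(vals)) - 1
        crit[grp] = vals[max(0, min(idx, len(vals) - 1))]
    return crit


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(300)]
    # Hypothetical planted cluster: success rate 0.8 in [0, 0.25]^2, 0.2 elsewhere.
    zs = [int(rng.random() < (0.8 if x < 0.25 and y < 0.25 else 0.2)) for x, y in pts]
    observed = group_maxima(pts, zs)
    crit = critical_values(pts, zs, n_perm=49)
    for grp in sorted(observed):
        print(grp, round(observed[grp], 2), observed[grp] > crit[grp])
```

Comparing each group's maximum with its own threshold means that small windows, which are far more numerous, do not dictate the critical value applied to large windows. This brute-force enumeration costs on the order of `g**4` window evaluations and does not attempt the almost-linear complexity described in the abstract.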

#### Article information

**Dates**

First available in Project Euclid: 19 February 2010

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1266586621

**Digital Object Identifier**

doi:10.1214/09-AOS732

**Mathematical Reviews number (MathSciNet)**

MR2604703

**Zentralblatt MATH identifier**

1183.62076

**Subjects**

Primary: 62G10: Hypothesis testing

Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

**Keywords**

Scan statistic; Bernoulli model; optimal detection; multiscale inference; fast algorithm; concentration inequality

#### Citation

Walther, Guenther. Optimal and fast detection of spatial clusters with scan statistics. Ann. Statist. 38 (2010), no. 2, 1010-1033. doi:10.1214/09-AOS732. https://projecteuclid.org/euclid.aos/1266586621