The Annals of Statistics

Searching for a trail of evidence in a maze

Ery Arias-Castro, Emmanuel J. Candès, Hannes Helgason, and Ofer Zeitouni


Abstract

Consider a graph with a set of vertices and oriented edges connecting pairs of vertices. Each vertex is associated with a random variable and these are assumed to be independent. In this setting, suppose we wish to solve the following hypothesis testing problem: under the null, the random variables have common distribution N(0, 1), while under the alternative, there is an unknown path along which the random variables have distribution N(μ, 1), μ > 0, and distribution N(0, 1) away from it. For which values of the mean shift μ can one reliably detect the presence of such a path, and for which values is this impossible?
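
In symbols, the testing problem can be written roughly as follows. The notation is introduced here for illustration only and is not taken from the abstract: $V$ denotes the vertex set, $\mathcal{P}$ the class of admissible paths, and $X_v$ the variable attached to vertex $v$.

```latex
\begin{align*}
H_0 &: \; X_v \sim N(0,1) \ \text{independently for all } v \in V, \\
H_1 &: \; \text{there exists } p \in \mathcal{P} \text{ such that }
      X_v \sim N(\mu,1) \text{ for } v \in p
      \ \text{and} \
      X_v \sim N(0,1) \text{ for } v \notin p,
      \ \text{all independent}, \ \mu > 0.
\end{align*}
```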

Consider, for example, the usual regular lattice with vertices of the form

{(i, j) : 0 ≤ i, −i ≤ j ≤ i and j has the parity of i}

and oriented edges (i, j) → (i+1, j+s), where s = ±1. We show that for paths of length m starting at the origin, the hypotheses become distinguishable (in a minimax sense) if $\mu_m \gg 1/\sqrt{\log m}$, while they are not if $\mu_m \ll 1/\log m$. We derive equivalent results in a Bayesian setting where one assumes that all paths are equally likely; there, the asymptotic threshold is $\mu_m \approx m^{-1/4}$.
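
To make the lattice setting concrete, here is a minimal simulation sketch. It rests on several assumptions that the abstract does not state: a path of length m is taken to visit one node in each of rows 0 through m−1, the planted path is drawn uniformly at random, and the statistic computed is the maximum sum over all such paths (a natural scan-type statistic, which may or may not be the test analyzed in the paper). The function names `simulate_lattice` and `max_path_sum` are hypothetical.

```python
import numpy as np

def simulate_lattice(m, mu=0.0, rng=None):
    """Draw independent N(0, 1) values on rows 0..m-1 of the lattice.

    Row i is stored as an array of length i + 1; index k corresponds to
    the lattice coordinate j = 2*k - i, so j has the parity of i.
    If mu != 0, a uniformly random path from the origin gets mean mu
    (an assumption made for this illustration).
    """
    rng = np.random.default_rng(rng)
    rows = [rng.standard_normal(i + 1) for i in range(m)]
    if mu != 0.0:
        k = 0
        rows[0][0] += mu
        for i in range(1, m):
            # step s = +1 moves k -> k + 1, step s = -1 keeps k
            k += int(rng.integers(0, 2))
            rows[i][k] += mu
    return rows

def max_path_sum(rows):
    """Maximum sum of node values over all oriented paths from the origin,
    computed by dynamic programming in O(m^2) time."""
    best = np.asarray(rows[0], dtype=float).copy()
    for i in range(1, len(rows)):
        # node k in row i is reachable from k - 1 (s = +1) and k (s = -1)
        from_left = np.concatenate(([-np.inf], best))
        from_right = np.concatenate((best, [-np.inf]))
        best = np.asarray(rows[i], dtype=float) + np.maximum(from_left, from_right)
    return float(best.max())
```

Comparing `max_path_sum(simulate_lattice(m, mu))` with its distribution under `mu=0` over repeated draws gives an empirical sense of how large μ must be before the two hypotheses separate for a given m.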

We obtain corresponding results for trees (where the threshold is of order 1 and independent of the size of the tree), for distributions other than the Gaussian and for other graphs. The concept of the predictability profile, first introduced by Benjamini, Pemantle and Peres, plays a crucial role in our analysis.

Article information

Source
Ann. Statist., Volume 36, Number 4 (2008), 1726-1757.

Dates
First available in Project Euclid: 16 July 2008

Permanent link to this document
https://projecteuclid.org/euclid.aos/1216237298

Digital Object Identifier
doi:10.1214/07-AOS526

Mathematical Reviews number (MathSciNet)
MR2435454

Zentralblatt MATH identifier
1143.62006

Subjects
Primary: 62C20: Minimax procedures; 62G10: Hypothesis testing
Secondary: 82B20: Lattice systems (Ising, dimer, Potts, etc.) and systems on graphs

Keywords
Detecting a chain of nodes in a network; minimax detection; Bayesian detection; predictability profile of a stochastic process; martingales; exponential families of random variables

Citation

Arias-Castro, Ery; Candès, Emmanuel J.; Helgason, Hannes; Zeitouni, Ofer. Searching for a trail of evidence in a maze. Ann. Statist. 36 (2008), no. 4, 1726--1757. doi:10.1214/07-AOS526. https://projecteuclid.org/euclid.aos/1216237298



References

  • [1] Ahuja, R. K., Magnanti, T. L. and Orlin, J. B. (1993). Network Flows. Theory, Algorithms and Applications. Prentice Hall, Englewood Cliffs, NJ.
  • [2] Arias-Castro, E., Donoho, D. and Huo, X. (2005). Near-optimal detection of geometric objects by fast multiscale methods. IEEE Trans. Inform. Theory 51 2402–2425.
  • [3] Arias-Castro, E., Donoho, D. and Huo, X. (2006). Adaptive multiscale detection of filamentary structures in a background of uniform random points. Ann. Statist. 34 326–349.
  • [4] Bahadur, R. and Ranga Rao, R. (1960). On deviations of the sample mean. Ann. Math. Statist. 31 1015–1027.
  • [5] Baik, J. and Rains, E. M. (2001a). The asymptotics of monotone subsequences of involutions. Duke Math. J. 109 205–281.
  • [6] Baik, J. and Rains, E. M. (2001b). Symmetrized random permutations. In Random Matrix Models and Their Applications (P. Bleher and A. Its, eds.) 1–19. Cambridge Univ. Press.
  • [7] Benjamini, I., Pemantle, R. and Peres, Y. (1998). Unpredictable paths and percolation. Ann. Probab. 26 1198–1211.
  • [8] Biggins, J. D. (1977). Martingale convergence in the branching random walk. J. Appl. Probab. 14 25–37.
  • [9] Buffet, E., Patrick, A. and Pulé, J. V. (1993). Directed polymers on trees: A martingale approach. J. Phys. A 26 1823–1834.
  • [10] Candès, E. J., Charlton, P. R. and Helgason, H. (2006). Detecting highly oscillatory signals by chirplet path pursuit. Technical report, California Institute of Technology.
  • [11] Comets, F., Shiga, T. and Yoshida, N. (2004). Probabilistic analysis of directed polymers in random environments: A review. In Stochastic Analysis on Large Scale Interacting Systems 115–142. Math. Soc. Japan, Tokyo.
  • [12] Derrida, B. and Spohn, H. (1988). Polymers on disordered trees, spin glasses, and traveling waves. J. Statist. Phys. 51 817–840.
  • [13] Donoho, D. L. and Huo, X. (2002). Beamlets and multiscale image analysis. In Multiscale and Multiresolution Methods. Lecture Notes in Comput. Sci. Eng. 20 149–196. Springer, Berlin.
  • [14] Feller, W. (1957). The numbers of zeros and of changes of sign in a symmetric random walk. Enseignement Math. (2) 3 229–235.
  • [15] Feller, W. (1968). An Introduction to Probability Theory and Its Applications I, 3rd ed. Wiley, New York.
  • [16] Glaz, J., Naus, J. and Wallenstein, S. (2001). Scan Statistics. Springer, New York.
  • [17] Häggström, O. and Mossel, E. (1998). Nearest-neighbor walks with low predictability profile and percolation in 2+ε dimensions. Ann. Probab. 26 1212–1231.
  • [18] Hoffman, C. (1998). Unpredictable nearest neighbor processes. Ann. Probab. 26 1781–1787.
  • [19] Ingster, Y. I. and Suslina, I. A. (2003). Nonparametric Goodness-of-fit Testing under Gaussian Models. Springer, New York.
  • [20] Kulldorff, M. (1997). A spatial scan statistic. Comm. Statist. Theory Methods 26 1481–1496.
  • [21] Lyons, R. (1997). A simple path to Biggins’ martingale convergence for branching random walk. In Classical and Modern Branching Processes (K. B. Athreya and P. Jagers, eds.) 217–221. Springer, New York.
  • [22] Patil, G. P., Balbus, J., Biging, G., Jaja, J., Myers, W. L. and Taillie, C. (2004). Multiscale advanced raster map analysis system: Definition, design and development. Environ. Ecol. Stat. 11 113–138.
  • [23] Pemantle, R. (1995). Tree-indexed processes. Statist. Sci. 10 200–213.
  • [24] Zhong, Y., Jain, A. and Dubuisson-Jolly, M.-P. (2000). Object tracking using deformable templates. IEEE Trans. Pattern Anal. Mach. Intell. 22 544–549.