Open Access
May, 1991
Tight Bounds and Approximations for Scan Statistic Probabilities for Discrete Data
Joseph Glaz, Joseph I. Naus
Ann. Appl. Probab. 1(2): 306-318 (May, 1991). DOI: 10.1214/aoap/1177005940

Abstract

Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed integer-valued random variables. Let $Y_{t - m + 1,t}$ for $t = m, m + 1,\ldots$ denote a moving sum of $m$ consecutive $X_i$'s. Let $N_{m,T} = \max_{m \leq t \leq T} \{Y_{t - m + 1,t}\}$ and let $\tau_{k,m}$ be the waiting time until the moving sum of $X_i$'s in a scanning window of $m$ trials is at least $k$. We derive tight bounds for the equivalent probabilities $P(\tau_{k,m} > T) = P(N_{m,T} < k)$. We apply the bounds to two problems in molecular biology: the distribution of the length of the longest almost-matching subsequence in aligned amino acid sequences and the distribution of the largest net charge within any $m$ consecutive positions in a charged alphabet string.
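
To make the definitions above concrete, the following is a minimal Python sketch that computes the moving sums $Y_{t-m+1,t}$, the scan statistic $N_{m,T}$, and the waiting time $\tau_{k,m}$ for one fixed sequence. The function names and the sample data are illustrative assumptions, not taken from the paper, and the sketch reads $\tau_{k,m}$ as the first trial index $t \geq m$ at which the window ending at $t$ sums to at least $k$. It only illustrates the event identity $\{\tau_{k,m} > T\} = \{N_{m,T} < k\}$; it does not implement the paper's bounds or approximations.

```python
from itertools import accumulate


def moving_sums(xs, m):
    """Return [Y_{t-m+1,t} for t = m, ..., T], where T = len(xs)."""
    prefix = [0] + list(accumulate(xs))  # prefix[t] = X_1 + ... + X_t
    return [prefix[t] - prefix[t - m] for t in range(m, len(xs) + 1)]


def scan_statistic(xs, m):
    """N_{m,T}: the largest moving sum over all windows of m trials."""
    return max(moving_sums(xs, m))


def waiting_time(xs, m, k):
    """First t >= m whose window sum is at least k (assumed reading of tau).

    Returns None when no window within the observed T trials reaches k,
    which corresponds to the event tau_{k,m} > T.
    """
    for offset, y in enumerate(moving_sums(xs, m)):
        if y >= k:
            return m + offset
    return None


# Hypothetical 0/1 data, e.g. match indicators between two aligned sequences.
xs = [0, 1, 0, 1, 1, 0, 1, 1, 1, 0]
m = 3

sums = moving_sums(xs, m)
n_mt = scan_statistic(xs, m)

# The two events in the abstract coincide for every threshold k.
for k in range(0, m + 2):
    assert (waiting_time(xs, m, k) is None) == (n_mt < k)
```

For this sample sequence the moving sums are `[1, 2, 2, 2, 2, 2, 3, 2]`, so $N_{3,10} = 3$ and the window first reaches 3 at trial $t = 9$. Prefix sums keep the computation linear in $T$ rather than recomputing each window from scratch.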

Citation


Joseph Glaz, Joseph I. Naus. "Tight Bounds and Approximations for Scan Statistic Probabilities for Discrete Data." Ann. Appl. Probab. 1 (2) 306 - 318, May, 1991. https://doi.org/10.1214/aoap/1177005940

Information

Published: May, 1991
First available in Project Euclid: 19 April 2007

zbMATH: 0738.60039
MathSciNet: MR1102323
Digital Object Identifier: 10.1214/aoap/1177005940

Subjects:
Primary: 60F10
Secondary: 60F99

Keywords: clustering probabilities, longest matching subsequences, scan statistics

Rights: Copyright © 1991 Institute of Mathematical Statistics

Vol. 1 • No. 2 • May, 1991