The Annals of Probability

The Erdos-Renyi Strong Law for Pattern Matching with a Given Proportion of Mismatches

R. Arratia and M. S. Waterman

Full-text: Open access

Abstract

Consider two random sequences $X_1 \cdots X_n$ and $Y_1 \cdots Y_n$ of i.i.d. letters in which the probability that two distinct letters match is $p > 0$. For each value $a$ between $p$ and 1, the length of the longest contiguous matching between the two sequences, requiring only a proportion $a$ of corresponding letters to match, satisfies a strong law analogous to the Erdos-Renyi law for coin tossing. The same law applies to matching between two nonoverlapping regions within a single sequence $X_1 \cdots X_n$, and a strong law with a smaller constant applies to matching between two overlapping regions within that single sequence. The method here also works to obtain the strong law for matching between multidimensional arrays, between two Markov chains and for the situation in which a given proportion of mismatches is required.
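The statistic described in the abstract can be illustrated with a short brute-force computation. This is a minimal sketch, not the authors' method. It assumes that a "contiguous matching" means a pair of equal-length windows $X_{i+1} \cdots X_{i+t}$ and $Y_{j+1} \cdots Y_{j+t}$ with arbitrary starting points $i$ and $j$, in which at least a proportion $a$ of the aligned letters agree. The function name `longest_match` and the four-letter alphabet in the demo are hypothetical choices made for illustration.

```python
import random


def longest_match(x, y, a):
    """Return the largest t such that some windows x[i:i+t] and y[j:j+t]
    agree in at least a fraction `a` of their t aligned positions.

    Brute force over diagonals; intended for small inputs only.
    Assumption: windows may start at any i and j (shifts are allowed).
    """
    n, m = len(x), len(y)
    best = 0
    for d in range(-(n - 1), m):  # d = j - i picks out one diagonal
        i0, j0 = max(0, -d), max(0, d)
        length = min(n - i0, m - j0)
        # prefix[k] = number of agreements in the first k diagonal positions
        prefix = [0]
        for k in range(length):
            prefix.append(prefix[-1] + (x[i0 + k] == y[j0 + k]))
        for s in range(length):
            # only windows strictly longer than the current best can improve it
            for e in range(s + best + 1, length + 1):
                if prefix[e] - prefix[s] >= a * (e - s):
                    best = e - s
    return best


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    n = 100
    # Uniform letters over four symbols, so the match probability is p = 1/4.
    x = [rng.choice("ACGT") for _ in range(n)]
    y = [rng.choice("ACGT") for _ in range(n)]
    for a in (1.0, 0.9, 0.75):
        print(a, longest_match(x, y, a))
```

Setting `a = 1.0` recovers exact matching, and lowering `a` toward $p$ can only lengthen the longest qualifying window. The comparison uses floating-point arithmetic, so values of `a` that are not exactly representable may behave unexpectedly at the boundary; `fractions.Fraction` is one way to avoid that.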

Article information

Source
Ann. Probab., Volume 17, Number 3 (1989), 1152-1169.

Dates
First available in Project Euclid: 19 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aop/1176991262

Digital Object Identifier
doi:10.1214/aop/1176991262

Mathematical Reviews number (MathSciNet)
MR1009450

Zentralblatt MATH identifier
0688.62019

JSTOR
links.jstor.org

Subjects
Primary: 62E20: Asymptotic distribution theory
Secondary: 62P10: Applications to biology and medical sciences

Keywords
Matching; large deviations; Ising model; Potts model; Hamming distance; DNA sequences; protein sequences

Citation

Arratia, R.; Waterman, M. S. The Erdos-Renyi Strong Law for Pattern Matching with a Given Proportion of Mismatches. Ann. Probab. 17 (1989), no. 3, 1152-1169. doi:10.1214/aop/1176991262. https://projecteuclid.org/euclid.aop/1176991262

