Open Access
August 2020
Matching strings in encoded sequences
Adriana Coutinho, Rodrigo Lambert, Jérôme Rousseau
Bernoulli 26(3): 2021-2050 (August 2020). DOI: 10.3150/19-BEJ1181


We investigate the length of the longest common substring for encoded sequences and its asymptotic behaviour. The main result is a strong law of large numbers for a re-scaled version of this quantity, which presents an explicit relation with the Rényi entropy of the source. We apply this result to the zero-inflated contamination model and the stochastic scrabble. In the case of dynamical systems, this problem is equivalent to the shortest distance between two observed orbits and its limiting relationship with the correlation dimension of the pushforward measure. An extension to the shortest distance between orbits for random dynamical systems is also provided.
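
As a rough illustration of the quantity studied, the sketch below computes the length of the longest common substring of two encoded sequences and compares its log-rescaled value with the Rényi entropy of order 2 of the encoded (pushforward) distribution. It relies on assumptions that are not taken from the abstract: the source is i.i.d., the encoder is a deterministic letter-by-letter map, and the benchmark is the classical i.i.d. limit 2/H_2. The paper's setting and exact statement may be more general. The names `longest_common_substring_length`, `renyi_entropy_2` and `pushforward` are hypothetical helpers chosen for this example.

```python
import math
import random


def longest_common_substring_length(x, y):
    """Length of the longest run of consecutive symbols appearing in both x and y."""
    best = 0
    prev = [0] * (len(y) + 1)
    for a in x:
        # cur[j] = length of the longest common suffix of the prefix of x
        # ending at a and the prefix y[:j].
        cur = [0] * (len(y) + 1)
        for j, b in enumerate(y, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best = cur[j]
        prev = cur
    return best


def renyi_entropy_2(probs):
    """Renyi entropy of order 2 of a finite distribution: -log(sum of p^2)."""
    return -math.log(sum(p * p for p in probs))


def pushforward(source, code):
    """Distribution of code[symbol] when symbol is drawn from `source`."""
    image = {}
    for symbol, p in source.items():
        image[code[symbol]] = image.get(code[symbol], 0.0) + p
    return image


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed toy source on {0, 1, 2} and an encoder that merges 1 and 2.
    source = {0: 0.5, 1: 0.25, 2: 0.25}
    code = {0: 0, 1: 1, 2: 1}
    n = 1000

    symbols, weights = zip(*source.items())
    x = [code[s] for s in rng.choices(symbols, weights, k=n)]
    y = [code[s] for s in rng.choices(symbols, weights, k=n)]

    m_n = longest_common_substring_length(x, y)
    h2 = renyi_entropy_2(pushforward(source, code).values())
    print("M_n / log n      :", m_n / math.log(n))
    print("2 / H_2 (encoded):", 2 / h2)
```

The dynamic program takes time proportional to the product of the two sequence lengths, which is adequate for a small experiment. Convergence is on a logarithmic scale, so the two printed values should be expected to agree only roughly at this sample size.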



Adriana Coutinho, Rodrigo Lambert, Jérôme Rousseau. "Matching strings in encoded sequences." Bernoulli 26 (3) 2021-2050, August 2020.


Received: 1 April 2019; Revised: 1 October 2019; Published: August 2020
First available in Project Euclid: 27 April 2020

zbMATH: 07193951
MathSciNet: MR4091100
Digital Object Identifier: 10.3150/19-BEJ1181

Keywords: coding, correlation dimension, random dynamical systems, Rényi entropy, shortest distance, string matching

Rights: Copyright © 2020 Bernoulli Society for Mathematical Statistics and Probability
