Distribution of clump statistics for a collection of words
Donald E. K. Martin, Deidra A. Coleman
J. Appl. Probab. 48(4): 1049-1059 (December 2011). DOI: 10.1239/jap/1324046018

Abstract

We give an efficient method based on minimal deterministic finite automata for computing the exact distribution of the number of occurrences and coverage of clumps (maximal sets of overlapping words) of a collection of words. In addition, we compute probabilities for the number of h-clumps, word groupings where gaps of a maximal length h between occurrences of words are allowed. The method facilitates the computation of p-values for testing procedures. A word is allowed to contain other words of the collection, making the computation more general, but also more difficult. The underlying sequence is assumed to be Markovian of an arbitrary order.
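To make the quantities in the abstract concrete, the following is a minimal Python sketch of what "number of clumps", "coverage" and "h-clumps" mean for a given sequence, followed by a brute-force computation of the exact distribution of the clump count. It is not the automaton-based method of the paper: the distribution is obtained by enumerating every sequence of length n, which is feasible only for very short sequences. All function names, the first-order Markov parameterization (`init`, `trans`), and the reading of an h-clump as "at most h uncovered positions between consecutive occurrences" are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def find_occurrences(text, words):
    """Return sorted (start, end) pairs, end inclusive, of every word occurrence."""
    occ = set()
    for w in words:
        start = text.find(w)
        while start != -1:
            occ.add((start, start + len(w) - 1))
            start = text.find(w, start + 1)  # step by 1 so self-overlaps are found
    return sorted(occ)


def clumps(text, words, max_gap=None):
    """Merge word occurrences into clumps.

    max_gap=None: occurrences must share a position (plain clumps).
    max_gap=h:    up to h uncovered positions may separate consecutive
                  occurrences (assumed reading of an h-clump).
    """
    slack = 0 if max_gap is None else max_gap + 1
    merged = []
    for s, e in find_occurrences(text, words):
        if merged and s <= merged[-1][1] + slack:
            merged[-1][1] = max(merged[-1][1], e)  # extend the current clump
        else:
            merged.append([s, e])  # start a new clump
    return [tuple(m) for m in merged]


def clump_stats(text, words, max_gap=None):
    """Return (number of clumps, total number of positions spanned by clumps)."""
    cs = clumps(text, words, max_gap)
    return len(cs), sum(e - s + 1 for s, e in cs)


def clump_count_distribution(n, words, init, trans):
    """Exact distribution of the clump count by enumerating all length-n sequences.

    init[a] is P(X_1 = a); trans[a][b] is P(X_{t+1} = b | X_t = a).
    Brute force only: cost grows as |alphabet| ** n.
    """
    dist = defaultdict(float)
    alphabet = sorted(init)
    for seq in product(alphabet, repeat=n):
        p = init[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= trans[a][b]
        count, _ = clump_stats("".join(seq), words)
        dist[count] += p
    return dict(dist)
```

For example, in the sequence `ACACGTACATT` with words `ACA` and `CAC`, the occurrences at positions 0-2 and 1-3 overlap and form one clump, and the occurrence at 6-8 forms a second, so `clump_stats` reports two clumps covering seven positions. Allowing gaps of length 2 merges them into a single h-clump. Note that the word collection here is allowed to overlap and self-overlap, which is the situation the abstract describes as making the computation harder.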

Citation

Donald E. K. Martin, Deidra A. Coleman. "Distribution of clump statistics for a collection of words." J. Appl. Probab. 48 (4): 1049-1059, December 2011. https://doi.org/10.1239/jap/1324046018

Information

Published: December 2011
First available in Project Euclid: 16 December 2011

zbMATH: 1250.62012
MathSciNet: MR2896667
Digital Object Identifier: 10.1239/jap/1324046018

Subjects:
Primary: 60E05
Secondary: 60J05

Keywords: clumps of a pattern, coarsest partition, deterministic finite automaton

Rights: Copyright © 2011 Applied Probability Trust

JOURNAL ARTICLE
11 PAGES

