Let $L_n$ be the length of the longest common subsequence of two independent i.i.d. sequences of Bernoulli variables of length $n$. We prove that the order of the standard deviation of $L_n$ is $\sqrt{n}$, provided the parameter of the Bernoulli variables is small enough. This validates Waterman’s conjecture in this situation [Philos. Trans. R. Soc. Lond. Ser. B 344 (1994) 383–390]. The order conjectured by Chvatal and Sankoff [J. Appl. Probab. 12 (1975) 306–315], however, is different.
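For readers who want to experiment with the quantity defined above, the following is a minimal sketch, not the method of the paper. It computes the longest common subsequence length with the standard dynamic program and estimates its standard deviation by Monte Carlo. The function names (`lcs_length`, `sample_ln`, `estimate_sd`) and the parameters `p`, `trials` and `seed` are illustrative assumptions.

```python
import random
import statistics


def lcs_length(x, y):
    """Length of the longest common subsequence of x and y.

    Classic dynamic program, O(len(x) * len(y)) time, O(len(y)) memory.
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sample_ln(n, p, rng):
    """Draw two independent Bernoulli(p) sequences of length n; return their LCS length."""
    x = [1 if rng.random() < p else 0 for _ in range(n)]
    y = [1 if rng.random() < p else 0 for _ in range(n)]
    return lcs_length(x, y)


def estimate_sd(n, p, trials, seed=0):
    """Monte Carlo estimate of the standard deviation of the LCS length (hypothetical helper)."""
    rng = random.Random(seed)
    samples = [sample_ln(n, p, rng) for _ in range(trials)]
    return statistics.stdev(samples)


if __name__ == "__main__":
    # Illustrative values only; p = 0.1 is an assumed "small" parameter.
    for n in (100, 200, 400):
        print(n, estimate_sd(n, p=0.1, trials=200))
```

A simulation at these sizes can only suggest a growth rate. It does not distinguish reliably between competing asymptotic orders, which is why a proof is needed.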
References
[1] Alexander, K. S. (1994). The rate of convergence of the mean length of the longest common subsequence. Ann. Appl. Probab. 4 1074–1082.
[2] Amsalu, S., Matzinger, H. and Popov, S. (2007). Macroscopic non-uniqueness and transversal fluctuation in optimal random sequence alignment. ESAIM Probab. Stat. 11 281–300.
[3] Arratia, R. and Waterman, M. S. (1994). A phase transition for the score in matching random sequences allowing deletions. Ann. Appl. Probab. 4 200–225.
[4] Baeza-Yates, R. A., Gavaldà, R., Navarro, G. and Scheihing, R. (1999). Bounding the expected length of longest common subsequences and forests. Theory Comput. Syst. 32 435–452.
[5] Bonetto, F. and Matzinger, H. (2006). Fluctuations of the longest common subsequence in the asymmetric case of 2- and 3-letter alphabets. ALEA Lat. Am. J. Probab. Math. Stat. 2 195–216 (electronic).
[6] Boutet de Monvel, J. (1999). Extensive simulations for longest common subsequences. Eur. Phys. J. B 7 293–308.
[7] Chvatal, V. and Sankoff, D. (1975). Longest common subsequences of two random sequences. J. Appl. Probab. 12 306–315.
[8] Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Applications of Mathematics (New York) 31. Springer, New York.
[9] Hauser, R., Martínez, S. and Matzinger, H. (2006). Large deviations-based upper bounds on the expected relative length of longest common subsequences. Adv. in Appl. Probab. 38 827–852.
[10] Houdré, C., Lember, J. and Matzinger, H. (2006). On the longest common increasing binary subsequence. C. R. Math. Acad. Sci. Paris 343 589–594.
[11] Kiwi, M., Loebl, M. and Matoušek, J. (2005). Expected length of the longest common subsequence for large alphabets. Adv. Math. 197 480–498.
[12] Lember, J., Matzinger, H. and Vollmer, A. (2007). Path properties of LCS-optimal alignments. Submitted.
[13] Matzinger, H., Lember, J. and Durringer, C. (2007). Deviation from mean in sequence comparison with a periodic sequence. ALEA Lat. Am. J. Probab. Math. Stat. 3 1–29 (electronic).
[14] Steele, J. M. (1986). An Efron–Stein inequality for nonsymmetric statistics. Ann. Statist. 14 753–758.
[15] Waterman, M. S. (1994). Estimating statistical significance of sequence alignments. Philos. Trans. R. Soc. Lond. Ser. B 344 383–390.
[16] Waterman, M. S. (1995). Introduction to Computational Biology. Chapman & Hall, London.
[17] Waterman, M. S. and Vingron, M. (1994). Sequence comparison significance and Poisson approximation. Statist. Sci. 9 367–381.