Abstract and Applied Analysis

Shannon Information and Power Law Analysis of the Chromosome Code

J. A. Tenreiro Machado

Full-text: Open access

Abstract

This paper studies the information content of the chromosomes of twenty-three species. Several statistics are derived, considering different numbers of bases for the alphabet character encoding. Based on the resulting histograms, word delimiters and character relative frequencies are identified. Knowledge of these data allows moving along each chromosome while evaluating the flow of characters and words. The resulting flux of information is captured by means of Shannon entropy. The results are explored from the perspective of power law relationships, allowing a quantitative evaluation of the DNA of the species.

Article information

Source
Abstr. Appl. Anal., Volume 2012, Special Issue (2012), Article ID 439089, 13 pages.

Dates
First available in Project Euclid: 5 April 2013

Permanent link to this document
https://projecteuclid.org/euclid.aaa/1365168345

Digital Object Identifier
doi:10.1155/2012/439089

Mathematical Reviews number (MathSciNet)
MR2975354

Zentralblatt MATH identifier
1253.94035

Citation

Tenreiro Machado, J. A. Shannon Information and Power Law Analysis of the Chromosome Code. Abstr. Appl. Anal. 2012, Special Issue (2012), Article ID 439089, 13 pages. doi:10.1155/2012/439089. https://projecteuclid.org/euclid.aaa/1365168345

