Sadovsky M G
Institute of Biophysics of Siberian Division of Russian Academy of Sciences, Akademgorodok, Krasnoyarsk, 660036, Russia.
Bull Math Biol. 2006 May;68(4):785-806. doi: 10.1007/s11538-005-9017-0. Epub 2006 Apr 7.
The information capacity of a nucleotide sequence is defined through the specific entropy of the frequency dictionary of that sequence, determined with respect to another dictionary containing the most probable continuations of shorter strings. This measure distinguishes a sequence both from a random one and from an ordered entity. A comparison of sequences based on their information capacity is studied. Order within genetic entities is found at length scales ranging from 3 to 8. Some other applications of the developed methodology to genetics, bioinformatics, and molecular biology are discussed.
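The abstract does not state the formula, so the following is only a minimal sketch of one common way such a measure can be computed, under stated assumptions. It assumes the frequency dictionary is the set of relative frequencies of all overlapping strings of length q, that the sequence is closed into a ring so that dictionaries of different lengths are mutually consistent, and that the "most probable continuation" of strings of length q-1 is the estimate f(left) * f(right) / f(middle). The function names `frequency_dictionary` and `information_capacity` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log


def frequency_dictionary(seq, q):
    """Relative frequencies of all overlapping strings of length q.

    Assumption: the sequence is treated as circular, so every position
    starts exactly one string and the frequencies sum to 1.
    """
    n = len(seq)
    ext = seq + seq[:q - 1]  # wrap around the end of the sequence
    counts = Counter(ext[i:i + q] for i in range(n))
    return {word: count / n for word, count in counts.items()}


def information_capacity(seq, q):
    """Relative entropy of the real q-dictionary with respect to the one
    reconstructed from strings of length q-1 (assumed formula, q >= 2)."""
    if q < 2:
        raise ValueError("q must be at least 2")
    f_q = frequency_dictionary(seq, q)
    f_prev = frequency_dictionary(seq, q - 1)
    f_mid = frequency_dictionary(seq, q - 2) if q > 2 else None
    total = 0.0
    for word, f in f_q.items():
        left = f_prev[word[:-1]]   # first q-1 symbols
        right = f_prev[word[1:]]   # last q-1 symbols
        middle = f_mid[word[1:-1]] if f_mid is not None else 1.0
        expected = left * right / middle  # reconstructed frequency
        total += f * log(f / expected)
    return total


if __name__ == "__main__":
    periodic = "ACGT" * 50
    for q in range(2, 5):
        print(q, information_capacity(periodic, q))
```

In this sketch a strictly periodic sequence such as `"ACGT" * 50` gives a positive value at q = 2 (pairs are not predictable from single-letter frequencies) and zero at q = 3 (triplets are fully determined by pairs). This illustrates how a measure of this kind can locate the string length at which order appears.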