Bugaenko N N, Gorban' A N, Sadovskiĭ M G
Biofizika. 1997 Sep-Oct;42(5):1047-53.
The problem of determining the information content of nucleotide sequences is discussed. Using the maximum entropy method, exact expressions are obtained for reconstructing higher-order frequency dictionaries from lower-order ones. In form, these expressions are analogous to the superposition approximations known in statistical physics. Entropy characteristics of real nucleotide sequences that reliably distinguish them from random texts are described. Methods are proposed for comparing the information content of frequency dictionaries and for assessing the residual uncertainty of a text given a known frequency dictionary.
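To make the idea of reconstructing a higher-order frequency dictionary concrete, here is a minimal Python sketch. The abstract does not state the formulas, so the form used here is an assumption: the standard maximum-entropy (Markov-type) extension f(abc) ≈ f(ab) f(bc) / f(b), which has the superposition-like shape the abstract alludes to. The function names, the choice to close the text into a ring (so that lower-order dictionaries are exact marginals of higher-order ones), and the example sequence are all hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter
from math import log2


def frequency_dictionary(text, q):
    """Relative frequencies of all overlapping words of length q.

    Assumption: the text is closed into a ring, so every position starts
    exactly one word and the dictionaries of different orders are consistent.
    """
    n = len(text)
    ring = text + text[:q - 1]
    counts = Counter(ring[i:i + q] for i in range(n))
    return {word: count / n for word, count in counts.items()}


def entropy(freqs):
    """Shannon entropy of a frequency dictionary, in bits."""
    return -sum(f * log2(f) for f in freqs.values() if f > 0)


def reconstruct(f_q, f_q_minus_1):
    """Estimate the order-(q+1) dictionary from the order-q and order-(q-1) ones.

    Assumed maximum-entropy form: two words of length q that overlap in
    q-1 symbols are glued together, and the product of their frequencies
    is divided by the frequency of the shared overlap.
    """
    estimate = {}
    for left, f_left in f_q.items():
        for right, f_right in f_q.items():
            overlap = left[1:]
            if overlap == right[:-1]:
                estimate[left + right[-1]] = f_left * f_right / f_q_minus_1[overlap]
    return estimate


if __name__ == "__main__":
    sequence = "ACGTTGCAACGGTACGATTGCA"  # made-up example, not real data
    f1 = frequency_dictionary(sequence, 1)
    f2 = frequency_dictionary(sequence, 2)
    f3 = frequency_dictionary(sequence, 3)
    f3_estimate = reconstruct(f2, f1)
    print("entropy of observed triplets:     ", entropy(f3))
    print("entropy of reconstructed triplets:", entropy(f3_estimate))
```

Under these assumptions the reconstructed dictionary never has lower entropy than the observed one with the same pair frequencies. The gap between the two entropies is one possible way to quantify how much information the triplet dictionary carries beyond the pair dictionary, in the spirit of the comparison the abstract describes.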