Jin Shuilin, Tan Renjie, Jiang Qinghua, Xu Li, Peng Jiajie, Wang Yong, Wang Yadong
Department of Mathematics, Harbin Institute of Technology, Harbin, Heilongjiang, China.
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China.
PLoS One. 2014 Feb 12;9(2):e88519. doi: 10.1371/journal.pone.0088519. eCollection 2014.
Topological entropy is one of the most difficult entropies to apply to DNA sequence analysis because of the finite-sample and high-dimensionality problems. To overcome these problems, a generalized topological entropy is introduced. The relationship between the topological entropy and the generalized topological entropy is examined, showing that the topological entropy is a special case of the generalized entropy. As an application, the generalized topological entropy was computed separately for introns, exons and promoter regions. The results indicate that, for each chromosome, the entropy of introns is higher than that of exons, and the entropy of exons is higher than that of promoter regions, which suggests that the DNA sequence of promoter regions is more regular than that of exons and introns.
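The abstract does not state the definitions, so the following is only a minimal sketch of how a topological entropy of a finite DNA sequence might be computed. It assumes the commonly used subword-counting form: choose the largest word length n such that 4^n + n - 1 does not exceed the sequence length, keep the first 4^n + n - 1 bases, count the distinct words of length n, and divide the base-4 logarithm of that count by n. The function name `topological_entropy` and its parameters are hypothetical, and the sketch does not implement the generalized topological entropy introduced in the paper.

```python
import math


def topological_entropy(seq: str, alphabet_size: int = 4) -> float:
    """Subword-counting topological entropy of a finite sequence.

    Assumption (not taken from the abstract): use the largest n with
    alphabet_size**n + n - 1 <= len(seq), truncate the sequence to that
    many symbols, and return log_{alphabet_size}(distinct n-words) / n.
    """
    length = len(seq)

    # Find the largest word length n that the sequence can support.
    n = 0
    while alphabet_size ** (n + 1) + n <= length:
        n += 1
    if n == 0:
        raise ValueError("sequence is too short for word length 1")

    # Truncate so that exactly alphabet_size**n windows of length n exist;
    # this caps the entropy at 1 regardless of the original length.
    window = seq[: alphabet_size ** n + n - 1]
    distinct_words = {window[i:i + n] for i in range(len(window) - n + 1)}

    return math.log(len(distinct_words), alphabet_size) / n


if __name__ == "__main__":
    # A repetitive sequence scores low; a sequence using every base scores high.
    print(topological_entropy("AAAA"))
    print(topological_entropy("ACGT"))
```

Under this assumed definition the value lies between 0 and 1, with lower values indicating a more regular sequence, which is the sense in which the abstract describes promoter regions as more regular than exons and introns.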