College of Science, Dalian Jiaotong University, Dalian 116028, China.
Math Biosci. 2013 Nov;246(1):8-13. doi: 10.1016/j.mbs.2013.09.004. Epub 2013 Sep 20.
Comparing DNA sequences reveals their degree of similarity, which helps infer how they are related in structure, function and evolution. In this paper, we introduce the fundamentals of a weighted relative entropy based on a 2-step Markov model for comparing DNA sequences. A DNA sequence, consisting of the four characters A, T, C and G, can be regarded as a Markov chain. Taking the state space I={A, T, C, G} and describing each DNA sequence by its 2-step transition probability matrix, we obtain the eigenvalue of the DNA sequence and use it to define a similarity metric. This yields a new method for comparing DNA sequences, which we use to classify chromosomal DNA sequences obtained from 30 species. The phylogenetic tree built with this alignment-free method, from the distance matrix derived from the weighted relative entropy, shows a clearer and more accurate division.
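The abstract does not give the formulas, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes that the 2-step transition probabilities are estimated directly from counts of bases two positions apart, that the weights are the base frequencies of the first sequence, and that the entropy is symmetrized to give a distance. The function names and the pseudocount are hypothetical.

```python
import math
from collections import Counter

STATES = "ATCG"  # state space I = {A, T, C, G}


def two_step_matrix(seq, pseudocount=1.0):
    """Estimate P(x[i+2] = b | x[i] = a) from counts (assumed estimator).

    The pseudocount is an assumption that keeps every entry positive,
    so the logarithms below are always defined.
    """
    counts = Counter(zip(seq, seq[2:]))  # pairs of bases two positions apart
    matrix = {}
    for a in STATES:
        row_total = sum(counts[(a, b)] for b in STATES) + pseudocount * len(STATES)
        matrix[a] = {b: (counts[(a, b)] + pseudocount) / row_total for b in STATES}
    return matrix


def base_frequencies(seq):
    """Empirical frequency of each base; used here as the (assumed) weights."""
    counts = Counter(c for c in seq if c in STATES)
    total = sum(counts.values())
    return {a: counts[a] / total for a in STATES}


def weighted_relative_entropy(seq_p, seq_q):
    """Row-wise relative entropy of the two matrices, weighted by base frequency."""
    p, q = two_step_matrix(seq_p), two_step_matrix(seq_q)
    w = base_frequencies(seq_p)
    return sum(
        w[a] * p[a][b] * math.log(p[a][b] / q[a][b])
        for a in STATES
        for b in STATES
    )


def distance(seq_x, seq_y):
    """Symmetrized weighted relative entropy (symmetrization is an assumption)."""
    return 0.5 * (
        weighted_relative_entropy(seq_x, seq_y)
        + weighted_relative_entropy(seq_y, seq_x)
    )
```

Computing `distance` for every pair of sequences would give a distance matrix of the kind the abstract feeds into phylogenetic tree construction; the tree-building step itself is not sketched here.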