Department of Physics I, Faculty of Applied Sciences, Politehnica University of Bucharest, 313 Splaiul Independentei, RO-060042, Bucharest, Romania.
J Theor Biol. 2010 Dec 21;267(4):513-8. doi: 10.1016/j.jtbi.2010.09.027. Epub 2010 Sep 28.
Using chaos game representation (CGR), we introduce a novel and straightforward method for identifying similarities and dissimilarities between DNA sequences of the same type from different organisms. A matrix is associated with each CGR pattern, and similarities are obtained by comparing the matrices of the sequences of interest. Three methods of analyzing the resulting difference matrix are considered: a three-dimensional representation that gives both local and global information, a numerical characterization based on an n-letter word similarity measure, and a statistical evaluation. The method is illustrated by applying it to albumin nucleotide sequences from eight mammal species, taking human albumin as the reference.
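As a rough illustration of the matrix comparison described above, the following minimal sketch builds a CGR-based frequency matrix for each sequence and subtracts the two. It is not the authors' implementation: the corner assignment (A, C, G, T), the word length `k`, the normalization to frequencies, and the function names `cgr_matrix`, `difference_matrix`, and `l1_dissimilarity` are all assumptions made for this example, and the summed absolute difference is only a stand-in for the paper's n-letter word similarity measure, which the abstract does not define.

```python
import numpy as np

# Conventional CGR corner assignment, written as (x, y) bits.
# Assumption: the abstract does not say which assignment the paper uses.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_matrix(seq, k=3):
    """Return a 2^k x 2^k matrix of k-letter word frequencies laid out as in a CGR plot.

    Halving the distance to a corner is done exactly with integers: each new
    base becomes the most significant bit of the cell index, so the cell
    depends only on the last k bases.
    """
    n = 1 << k
    counts = np.zeros((n, n))
    ix = iy = 0
    run = 0  # number of consecutive valid bases seen so far
    for base in seq.upper():
        if base not in CORNERS:
            run = 0  # ambiguous symbol (e.g. N): restart the word
            continue
        bx, by = CORNERS[base]
        ix = (ix >> 1) | (bx << (k - 1))
        iy = (iy >> 1) | (by << (k - 1))
        run += 1
        if run >= k:
            counts[iy, ix] += 1
    total = counts.sum()
    return counts / total if total else counts


def difference_matrix(seq_a, seq_b, k=3):
    """Element-wise difference between the CGR matrices of two sequences."""
    return cgr_matrix(seq_a, k) - cgr_matrix(seq_b, k)


def l1_dissimilarity(seq_a, seq_b, k=3):
    """Sum of absolute differences; a hypothetical summary score, 0 for identical matrices."""
    return float(np.abs(difference_matrix(seq_a, seq_b, k)).sum())
```

With a reference sequence held fixed (human albumin in the study), `difference_matrix(reference, other)` could be plotted as a surface to inspect where word frequencies diverge, while `l1_dissimilarity` collapses the same matrix into a single number for ranking species.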