Xie Guo-Sen, Jin Xiao-Bo, Yang Chunlei, Pu Jiexin, Mo Zhongxi
Information Engineering College, Henan University of Science and Technology, Luoyang, 471023, China.
Henan Joint International Research Laboratory of Image Processing and Intelligent Detection, Henan University of Science and Technology, Luoyang, 471023, China.
Acta Biotheor. 2018 Jun;66(2):113-133. doi: 10.1007/s10441-018-9324-0. Epub 2018 Apr 19.
In this paper, we propose two four-base related 2D curves of DNA primary sequences (termed F-B curves) and their corresponding single-base related 2D curves (termed A-related, G-related, T-related and C-related curves). These graphical curves are constructed by assigning each base to one of four different sinusoidal (or tangent) functions. Connecting all the points on the four sinusoidal (or tangent) functions yields the F-B curves; connecting the points on each individual function yields the single-base related 2D curves. The proposed 2D curves are all strictly non-degenerate. An 8-component characteristic vector, based on the normalized geometric centers of the proposed curves, is then constructed to compare the similarity of DNA sequences from different species. As examples, we examine the similarity among the coding sequences of the first exon of the beta-globin gene from eleven species, among the cDNA sequences of the beta-globin gene from eight species, and among the whole mitochondrial genomes of 18 eutherian mammals. The experimental results demonstrate the effectiveness of the proposed method.
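The construction can be illustrated with a minimal Python sketch. The abstract does not give the actual functions, the normalization, or the composition of the 8-component vector, so everything specific here is an assumption made for illustration: the vertical offsets in `OFFSETS`, the point mapping `(i, sin(i) + offset)`, normalizing the x-center by sequence length, and building the vector from the centers of the four single-base curves. The names `fb_curve`, `single_base_curve`, `normalized_center`, `feature_vector` and `distance` are hypothetical and do not come from the paper.

```python
import math

# Hypothetical vertical offsets: one shifted sinusoid per base (not from the paper).
OFFSETS = {"A": 0.0, "G": 2.0, "T": 4.0, "C": 6.0}
BASES = "AGTC"


def fb_curve(seq):
    """Assumed F-B curve: base at position i maps to (i, sin(i) + offset[base])."""
    return [(i, math.sin(i) + OFFSETS[b]) for i, b in enumerate(seq.upper(), start=1)]


def single_base_curve(seq, base):
    """Points of the F-B curve that lie on the sinusoid assigned to `base`."""
    return [p for p, b in zip(fb_curve(seq), seq.upper()) if b == base]


def normalized_center(points, n):
    """Geometric center of the points; x is divided by sequence length n (assumption)."""
    if not points:
        return (0.0, 0.0)
    cx = sum(x for x, _ in points) / len(points) / n
    cy = sum(y for _, y in points) / len(points)
    return (cx, cy)


def feature_vector(seq):
    """Assumed 8-component vector: (x, y) center of each of the four single-base curves."""
    vec = []
    for base in BASES:
        vec.extend(normalized_center(single_base_curve(seq, base), len(seq)))
    return vec


def distance(seq_a, seq_b):
    """Euclidean distance between feature vectors; smaller means more similar."""
    return math.dist(feature_vector(seq_a), feature_vector(seq_b))


if __name__ == "__main__":
    # Short made-up sequences, used only to exercise the functions.
    print(feature_vector("ATGGTGCACCTG"))
    print(distance("ATGGTGCACCTG", "ATGGTGCATCTG"))
```

Because the x-coordinate in this sketch is the strictly increasing position index, no two bases share a point and the curve never retraces itself, which is one simple way a 2D representation can avoid degeneracy.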