Graphical Representation and Similarity Analysis of DNA Sequences Based on Trigonometric Functions.

Author information

Xie Guo-Sen, Jin Xiao-Bo, Yang Chunlei, Pu Jiexin, Mo Zhongxi

Affiliations

Information Engineering College, Henan University of Science and Technology, Luoyang, 471023, China.

Henan Joint International Research Laboratory of Image Processing and Intelligent Detection, Henan University of Science and Technology, Luoyang, 471023, China.

Publication information

Acta Biotheor. 2018 Jun;66(2):113-133. doi: 10.1007/s10441-018-9324-0. Epub 2018 Apr 19.

Abstract

In this paper, we propose two four-base related 2D curves of DNA primary sequences (termed F-B curves) and their corresponding single-base related 2D curves (termed A-related, G-related, T-related and C-related curves). The curves are constructed by assigning each base to one of four different sinusoidal (or tangent) functions. Connecting all of the resulting points across the four sinusoidal (or tangent) functions yields the F-B curves; connecting the points on each individual function yields the single-base related 2D curves. The proposed 2D curves are all strictly non-degenerate. An 8-component characteristic vector, built from the normalized geometric centers of the proposed curves, is then used to compare the similarity of DNA sequences from different species. As examples, we examine the similarity among the coding sequences of the first exon of the beta-globin gene from eleven species, among cDNA sequences of the beta-globin gene from eight species, and among the whole mitochondrial genomes of 18 eutherian mammals. The experimental results demonstrate the effectiveness of the proposed method.
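The pipeline outlined in the abstract (assign bases to sinusoids, collect per-base points, take geometric centers, compare vectors) can be illustrated with a minimal Python sketch. The abstract does not give the actual functions, frequencies, or normalization, so everything specific here is an assumption made for illustration and is not the authors' construction: the phase offsets in `PHASES`, the frequency `OMEGA`, the choice to normalize the x-center by sequence length, the reading of the 8 components as (x, y) centers of the four single-base curves, and the use of Euclidean distance.

```python
import math

# Hypothetical parameters: one sinusoid per base, separated by phase.
# These values are NOT taken from the paper.
PHASES = {"A": 0.0, "G": math.pi / 2, "T": math.pi, "C": 3 * math.pi / 2}
OMEGA = 0.1  # assumed angular frequency


def base_points(seq):
    """Map the base at 1-based position i to the point (i, sin(OMEGA*i + phase)).

    Returns a dict from base to the list of points on that base's sinusoid,
    i.e. the vertices of the four single-base related curves.
    """
    points = {base: [] for base in PHASES}
    for i, base in enumerate(seq.upper(), start=1):
        if base in PHASES:
            points[base].append((i, math.sin(OMEGA * i + PHASES[base])))
    return points


def characteristic_vector(seq):
    """Build an 8-component vector from the centers of the four curves.

    Assumed normalization: the x-center is divided by the sequence length so
    that sequences of different lengths are comparable.
    """
    n = len(seq)
    points = base_points(seq)
    vec = []
    for base in "AGTC":
        pts = points[base]
        if pts:
            cx = sum(x for x, _ in pts) / len(pts) / n
            cy = sum(y for _, y in pts) / len(pts)
        else:
            cx = cy = 0.0  # base absent from the sequence
        vec.extend([cx, cy])
    return vec


def distance(seq_a, seq_b):
    """Euclidean distance between characteristic vectors (smaller = more similar)."""
    return math.dist(characteristic_vector(seq_a), characteristic_vector(seq_b))
```

Under these assumptions, `distance(s, s)` is zero for any sequence `s`, and a single substitution shifts the centers of the two affected single-base curves, so the distance becomes positive. The sequences used to try it out should come from the reader's own data; none are supplied here.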

