A Statistical Similarity/Dissimilarity Analysis of Protein Sequences Based on a Novel Group Representative Vector.

Affiliation

Department of Engineering Mathematics and Physics, Faculty of Engineering, Mansoura University, Mansoura 35516, Egypt.

Publication information

Biomed Res Int. 2019 May 8;2019:8702968. doi: 10.1155/2019/8702968. eCollection 2019.

Abstract

Similarity/dissimilarity analysis is a key way of understanding the biology of an organism, because it reveals the origin of new genes and sequences. Sequence data are grouped according to biological relationships, and the number of sequences in any group may grow every day. All current alignment-free methods demonstrate the utility of their approaches by producing a similarity/dissimilarity matrix. Although this matrix is clear, it measures the degree of similarity only between individual sequences. In our work, a representative is introduced for each of three groups of protein sequences. Based on this group representative, a similarity/dissimilarity vector is evaluated instead of the ordinary similarity/dissimilarity matrix. The approach is applied to three selected groups of protein sequences: beta globin, NADH dehydrogenase subunit 5 (ND5), and spike protein sequences. A cross-group comparison is performed to confirm the distinctness of each group. A qualitative comparison of our approach with previous articles and with the phylogenetic tree of these protein sequences demonstrates its utility.
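The abstract does not say how the group representative or the dissimilarity measure is constructed, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each sequence is described by its amino acid composition, that the representative is the component-wise mean of those descriptors, and that dissimilarity is the Euclidean distance to the representative. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Relative frequency of each standard amino acid (assumed descriptor)."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def group_representative(seqs):
    """Component-wise mean of the members' descriptors (assumed definition)."""
    vectors = [composition(s) for s in seqs]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dissimilarity_vector(seqs, representative):
    """One distance per sequence, instead of a full pairwise matrix."""
    return [euclidean(composition(s), representative) for s in seqs]


# Made-up toy strings for illustration only; these are not real protein data.
group_a = ["MVHLTPEEKS", "MVHLTPEEKA", "MVHLSPEEKS"]
group_b = ["GGGGWWGGWW", "GGGWWWGGWW"]

rep_a = group_representative(group_a)
within = dissimilarity_vector(group_a, rep_a)  # members vs. own representative
cross = dissimilarity_vector(group_b, rep_a)   # cross-group comparison

print("within-group:", within)
print("cross-group: ", cross)
```

With n sequences in a group, this yields n distances to one representative rather than an n × n matrix, so a newly added sequence needs only one comparison per group. In this toy setup the cross-group distances exceed the within-group ones, which is the kind of separation a cross-group comparison is meant to check.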

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8853/6530227/22649f0f9114/BMRI2019-8702968.001.jpg
