A novel model for protein sequence similarity analysis based on spectral radius.

Authors

Wu Chuanyan, Gao Rui, De Marinis Yang, Zhang Yusen

Affiliations

School of Control Science and Engineering, Shandong University, Jinan 250061, China.

Publication information

J Theor Biol. 2018 Jun 7;446:61-70. doi: 10.1016/j.jtbi.2018.03.001. Epub 2018 Mar 7.

Abstract

Advances in sequencing technologies have led to a rapid increase in the number and diversity of biological sequences, which has facilitated the development of sequence research. In this paper, we present a new method for analyzing protein sequence similarity. We calculated the spectral radii of the 20 amino acids (AAs) and put forward a novel 2-D graphical representation of protein sequences. To characterize protein sequences numerically, three groups of features were extracted, relating to statistical measurements, dynamics measurements, and fluctuation complexity of the sequences. With the resulting feature vector, two models, one using Gaussian kernel similarity and one using cosine similarity, were built to measure the similarity between sequences. We applied our method to analyze the similarities and dissimilarities of four data sets. Both proposed models produced consistent results, with improvements over those obtained by ClustalW analysis. The novel approach we present in this study may therefore benefit protein research in medical and scientific fields.
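As a rough illustration of the two similarity measures named in the abstract, the following minimal Python sketch computes cosine similarity and Gaussian kernel similarity between two numeric feature vectors. The abstract does not give the exact formulas, feature definitions, or kernel bandwidth used in the paper, so the function names, the `sigma` parameter, the common form exp(-||x - y||^2 / (2 sigma^2)), and the example vectors are all assumptions made here for illustration only.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def gaussian_kernel_similarity(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2)).

    sigma is an assumed bandwidth; the abstract does not state a value.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))


# Hypothetical feature vectors for two sequences (not data from the paper).
features_a = [0.42, 1.30, 0.07]
features_b = [0.40, 1.25, 0.10]

print(cosine_similarity(features_a, features_b))
print(gaussian_kernel_similarity(features_a, features_b, sigma=1.0))
```

Both measures approach 1 for closely matching vectors. Cosine similarity depends only on direction, whereas the Gaussian kernel depends on Euclidean distance and on the chosen bandwidth.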


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78b3/7094169/c6996a24406f/fx1_lrg.jpg
