Normalized feature vectors: a novel alignment-free sequence comparison method based on the numbers of adjacent amino acids.

Affiliation

School of Electronics and Information Engineering, Tongji University, 4800 Caoan Road, Shanghai 201804, China.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2013 Mar-Apr;10(2):457-67. doi: 10.1109/TCBB.2013.10.

Abstract

Based on all kinds of adjacent amino acids (AAA), we map each protein primary sequence into a 400 × (L−1) matrix M. We then derive a normalized 400-tuple mathematical descriptor D, which is extracted from the primary protein sequence via singular value decomposition (SVD) of the matrix. The resulting 400-dimensional normalized feature vectors (NFVs) facilitate quantitative analysis of protein sequences. Using this normalized representation, we analyze the similarity of different sequences on two data sets: 1) ND5 sequences from nine species and 2) transferrin sequences of 24 vertebrates. We also compare the results of this study with those of other related works. These two experiments illustrate that the proposed NFV-AAA approach performs well in sequence similarity analysis.
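
The abstract does not say how the entries of M are defined or how D is read off the SVD, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that L is the sequence length, that column j of M holds a single 1 in the row of the adjacent pair at positions j and j+1, and that D collects one singular value per pair before scaling to unit length. Under that assumed construction each column contains exactly one 1, so M Mᵀ is diagonal and the nonzero singular values of M are the square roots of its row sums; the sketch uses this fact instead of calling a numerical SVD routine. The names `PAIR_INDEX`, `adjacency_matrix`, `normalized_feature_vector`, and `distance` are hypothetical, and the Euclidean distance is an assumed choice of similarity measure.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# Row index for each of the 400 ordered adjacent pairs (assumed ordering).
PAIR_INDEX = {a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def adjacency_matrix(seq):
    """Assumed construction: 400 x (L-1) 0/1 matrix; column j marks the pair seq[j:j+2]."""
    cols = len(seq) - 1
    m = [[0] * cols for _ in range(400)]
    for j in range(cols):
        m[PAIR_INDEX[seq[j:j + 2]]][j] = 1
    return m


def normalized_feature_vector(seq):
    """Assumed 400-D descriptor: one singular value per pair, scaled to unit length."""
    m = adjacency_matrix(seq)
    # Rows have disjoint support, so M M^T is diagonal and each
    # singular value equals the square root of a row sum.
    d = [math.sqrt(sum(row)) for row in m]
    norm = math.sqrt(sum(x * x for x in d))
    return [x / norm for x in d]


def distance(seq_a, seq_b):
    """Euclidean distance between two descriptors (an assumed similarity measure)."""
    va = normalized_feature_vector(seq_a)
    vb = normalized_feature_vector(seq_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))
```

Calling `distance("MKTAYIAK", "MKTAYLAK")` on two short illustrative strings (made-up sequences, not data from the paper) returns a small non-negative number that grows as the adjacent-pair composition of the two sequences diverges. The sketch requires sequences of at least two residues and raises a `KeyError` on non-standard residue codes.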
