From analysis of protein structural alignments toward a novel approach to align protein sequences.

Authors

Sunyaev Shamil R, Bogopolsky Gennady A, Oleynikova Natalia V, Vlasov Peter K, Finkelstein Alexei V, Roytberg Mikhail A

Affiliation

Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia.

Publication

Proteins. 2004 Feb 15;54(3):569-82. doi: 10.1002/prot.10503.

DOI:10.1002/prot.10503
PMID:14748004
Abstract

Alignment of protein sequences is a key step in most computational methods for prediction of protein function and homology-based modeling of three-dimensional (3D) structure. We investigated correspondence between "gold standard" alignments of 3D protein structures and the sequence alignments produced by the Smith-Waterman algorithm, currently the most sensitive method for pairwise alignment of sequences. The results of this analysis enabled development of a novel method to align a pair of protein sequences. The comparison of the Smith-Waterman and structure alignments focused on their inner structure and especially on the continuous ungapped alignment segments, "islands" between gaps. Approximately one third of the islands in the gold standard alignments have negative or low positive scores, and their recognition is below the sensitivity limit of the Smith-Waterman algorithm. From the alignment accuracy perspective, the time spent by the algorithm while working in these unalignable regions is unnecessary. We considered features of the standard similarity scoring function responsible for this phenomenon and suggested an alternative hierarchical algorithm, which explicitly addresses high scoring regions. This algorithm is considerably faster than the Smith-Waterman algorithm, whereas the resulting alignments are on average of the same quality with respect to the gold standard. This finding shows that a decrease in alignment accuracy is not necessarily the price of computational efficiency.
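
The notion of "islands" in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It only shows how a gapped pairwise alignment might be split into maximal ungapped segments and how each segment could be scored. The function names (`find_islands`, `island_score`), the example rows, and the `toy_score` function (match +2, mismatch -1) are all assumptions made for illustration; a real analysis would use a substitution matrix such as BLOSUM62.

```python
from typing import Callable, List, Tuple

GAP = "-"


def toy_score(x: str, y: str) -> int:
    """Hypothetical stand-in for a substitution matrix (assumption, not from the paper)."""
    return 2 if x == y else -1


def find_islands(row_a: str, row_b: str) -> List[Tuple[str, str]]:
    """Split a gapped pairwise alignment into maximal ungapped segments ("islands")."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    islands: List[Tuple[str, str]] = []
    seg_a: List[str] = []
    seg_b: List[str] = []
    for x, y in zip(row_a, row_b):
        if x == GAP or y == GAP:
            # A gap in either row closes the current island, if one is open.
            if seg_a:
                islands.append(("".join(seg_a), "".join(seg_b)))
                seg_a, seg_b = [], []
        else:
            seg_a.append(x)
            seg_b.append(y)
    if seg_a:
        islands.append(("".join(seg_a), "".join(seg_b)))
    return islands


def island_score(island: Tuple[str, str],
                 score: Callable[[str, str], int] = toy_score) -> int:
    """Sum the per-column scores of one ungapped segment."""
    seg_a, seg_b = island
    return sum(score(x, y) for x, y in zip(seg_a, seg_b))


# Made-up alignment rows, used only to exercise the functions.
row_a = "ACDEF--GHIKLM"
row_b = "ACDQFWWG-VRLQ"

islands = find_islands(row_a, row_b)
scores = [island_score(i) for i in islands]
for (seg_a, seg_b), s in zip(islands, scores):
    print(seg_a, seg_b, s)
```

Under this toy scoring, the last island (`IKLM` against `VRLQ`) scores below zero. That is the kind of segment the abstract describes as falling below the sensitivity limit of a score-maximizing aligner.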

Similar articles

1
From analysis of protein structural alignments toward a novel approach to align protein sequences.
Proteins. 2004 Feb 15;54(3):569-82. doi: 10.1002/prot.10503.
2
FAST: a novel protein structure alignment algorithm.
Proteins. 2005 Feb 15;58(3):618-27. doi: 10.1002/prot.20331.
3
[Information about the protein secondary structure improves quality of an alignment of protein sequences].
Mol Biol (Mosk). 2006 May-Jun;40(3):533-40.
4
Structure-based evaluation of sequence comparison and fold recognition alignment accuracy.
J Mol Biol. 2000 Apr 7;297(4):1003-13. doi: 10.1006/jmbi.2000.3615.
5
Homology-based modeling of 3D structures of protein-protein complexes using alignments of modified sequence profiles.
Int J Biol Macromol. 2008 Aug 15;43(2):198-208. doi: 10.1016/j.ijbiomac.2008.05.004. Epub 2008 May 21.
6
T-Coffee: A novel method for fast and accurate multiple sequence alignment.
J Mol Biol. 2000 Sep 8;302(1):205-17. doi: 10.1006/jmbi.2000.4042.
7
NdPASA: a novel pairwise protein sequence alignment algorithm that incorporates neighbor-dependent amino acid propensities.
Proteins. 2005 Feb 15;58(3):628-37. doi: 10.1002/prot.20359.
8
[Increasing the accuracy of the global alignment of amino acid sequences by constructing a set of alignment candidates].
Biofizika. 2010 Nov-Dec;55(6):965-75.
9
Analysis and prediction of functional sub-types from protein sequence alignments.
J Mol Biol. 2000 Oct 13;303(1):61-76. doi: 10.1006/jmbi.2000.4036.
10
SnapDRAGON: a method to delineate protein structural domains from sequence data.
J Mol Biol. 2002 Feb 22;316(3):839-51. doi: 10.1006/jmbi.2001.5387.

Cited by

1
The ranging of amino acids substitution matrices of various types in accordance with the alignment accuracy criterion.
BMC Bioinformatics. 2020 Sep 14;21(Suppl 11):294. doi: 10.1186/s12859-020-03616-0.
2
Comparative analysis of the quality of a global algorithm and a local algorithm for alignment of two sequences.
Algorithms Mol Biol. 2011 Oct 27;6(1):25. doi: 10.1186/1748-7188-6-25.
3
Splitting the BLOSUM score into numbers of biological significance.
EURASIP J Bioinform Syst Biol. 2007;2007(1):31450. doi: 10.1155/2007/31450.