Structure-dependent sequence alignment for remotely related proteins.

Author information

Yang An-Suei

Affiliation

Department of Pharmacology and Columbia Genome Center, Columbia University, 630 West 168th Street, PH 7 W Room 318, New York, NY 10032, USA.

Publication information

Bioinformatics. 2002 Dec;18(12):1658-65. doi: 10.1093/bioinformatics/18.12.1658.

Abstract

MOTIVATION

The quality of a model structure derived from a comparative modeling procedure is dictated by the accuracy of the predicted sequence-template alignment. As the sequence-template pairs are increasingly remote in sequence relationship, the prediction of the sequence-template alignments becomes increasingly problematic with sequence alignment methods. Structural information of the template, used in connection with the sequence relationship of the sequence-template pair, could significantly improve the accuracy of the sequence-template alignment. In this paper, we describe a sequence-template alignment method that integrates sequence and structural information to enhance the accuracy of sequence-template alignments for distantly related protein pairs.

RESULTS

The structure-dependent sequence alignment (SDSA) procedure was optimized for coverage and accuracy on a training set of 412 protein pairs; the structures in each training pair are similar (RMSD < approximately 4 Å) but the sequence relationship is undetectable (average pair-wise sequence identity = 8%). The optimized SDSA procedure was then applied to extend PSI-BLAST local alignments by calculating global alignments under the constraint of the residue pairs in the local alignments. This composite alignment procedure was assessed with a testing set of 1421 protein pairs, in which the paired structures are similar (RMSD < approximately 4 Å) but the sequences in each pair are marginally related at best (average pair-wise sequence identity = 13%). The assessment showed that the composite alignment procedure predicted more aligned residue pairs, with an average increase of 27% in correctly aligned residues over the standard PSI-BLAST alignments for the protein pairs in the testing set.
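
The idea of extending a local alignment into a global one under the constraint of already aligned residue pairs can be illustrated with a minimal Python sketch. It is not the SDSA procedure: the abstract does not give the structure-dependent scoring, so a simple identity score (match, mismatch, gap) stands in for it as an assumption, and the function names `needleman_wunsch` and `extend_with_anchors` are hypothetical. The anchor pairs play the role of the residue pairs taken from a PSI-BLAST local alignment; the regions before, between, and after them are aligned globally and stitched together.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns aligned (i, j) index pairs.

    Placeholder identity scoring (an assumption, not the SDSA score).
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, collecting aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def extend_with_anchors(a, b, anchors, **scoring):
    """Global alignment of a and b that keeps every anchor pair fixed.

    anchors: 0-based (i, j) pairs, strictly increasing in both indices,
    e.g. residue pairs taken from a local alignment.
    """
    result = []
    prev_i, prev_j = 0, 0
    for ai, aj in anchors:
        if ai < prev_i or aj < prev_j:
            raise ValueError("anchors must be strictly increasing")
        # Align the unconstrained region that precedes this anchor.
        for i, j in needleman_wunsch(a[prev_i:ai], b[prev_j:aj], **scoring):
            result.append((prev_i + i, prev_j + j))
        result.append((ai, aj))
        prev_i, prev_j = ai + 1, aj + 1
    # Align whatever remains after the last anchor.
    for i, j in needleman_wunsch(a[prev_i:], b[prev_j:], **scoring):
        result.append((prev_i + i, prev_j + j))
    return result


if __name__ == "__main__":
    query, template = "GGACDEFTT", "ACDEFWW"
    local_pairs = [(2, 0), (3, 1), (4, 2)]  # hypothetical local-alignment pairs
    print(extend_with_anchors(query, template, local_pairs))
```

Splitting at the anchors keeps the constraint exact and limits the dynamic programming to the unconstrained segments. A structure-aware score would replace the identity score inside `needleman_wunsch` without changing the stitching logic.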
