Improved protein structure selection using decoy-dependent discriminatory functions.

Authors

Wang Kai, Fain Boris, Levitt Michael, Samudrala Ram

Affiliation

Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA.

Publication details

BMC Struct Biol. 2004 Jun 18;4:8. doi: 10.1186/1472-6807-4-8.

DOI:10.1186/1472-6807-4-8

PMID:15207004

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC449718/

Abstract

BACKGROUND

A key component in protein structure prediction is a scoring or discriminatory function that can distinguish near-native conformations from misfolded ones. Various types of scoring functions have been developed to accomplish this goal, but their performance is not adequate to solve the structure selection problem. In addition, there is poor correlation between the scores and the accuracy of the generated conformations.

RESULTS

We present a simple and nonparametric formula to estimate the accuracy of predicted conformations (or decoys). This scoring function, called the density score function, evaluates decoy conformations by performing an all-against-all Cα RMSD (root mean square deviation) calculation in a given decoy set. We tested the density score function on 83 decoy sets grouped by their generation methods (4state_reduced, fisa, fisa_casp3, lmds, lattice_ssfit, semfold and Rosetta). The density scores have correlations as high as 0.9 with the Cα RMSDs of the decoy conformations, measured relative to the experimental conformation for each decoy. We previously developed a residue-specific all-atom probability discriminatory function (RAPDF), which compiles statistics from a database of experimentally determined conformations, to aid in structure selection. Here, we present a decoy-dependent discriminatory function called self-RAPDF, where we compiled the atom-atom contact probabilities from all the conformations in a decoy set instead of using an ensemble of native conformations, with a weighting scheme based on the density scores. The self-RAPDF has a higher correlation with Cα RMSD than RAPDF for 76/83 decoy sets, and selects better near-native conformations for 62/83 decoy sets. Self-RAPDF may be useful not only for selecting near-native conformations from decoy sets, but also for fold simulations and protein structure refinement.
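The all-against-all calculation described above can be illustrated with a minimal Python sketch. The abstract does not give the formula, so this sketch assumes the density score of a decoy is its mean Cα RMSD to every other decoy in the set, with each pair optimally superimposed by the Kabsch algorithm. The function names `kabsch_rmsd` and `density_scores`, the NumPy dependency, and the synthetic coordinates are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def kabsch_rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate arrays after optimal superposition."""
    # Remove translation by centering both structures on their centroids.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Singular values of the covariance matrix give the best achievable overlap.
    u, s, vt = np.linalg.svd(a.T @ b)
    # If the optimal orthogonal map is a reflection, flip the smallest
    # singular value so that only proper rotations are considered.
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(a * a) + np.sum(b * b) - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))


def density_scores(decoys: np.ndarray) -> np.ndarray:
    """Assumed density score: mean RMSD of each decoy to all other decoys.

    decoys has shape (M, N, 3): M conformations of N C-alpha atoms each.
    """
    m = len(decoys)
    rmsd = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            rmsd[i, j] = rmsd[j, i] = kabsch_rmsd(decoys[i], decoys[j])
    # The diagonal is zero, so divide by the number of *other* decoys.
    return rmsd.sum(axis=1) / (m - 1)


# Synthetic demonstration (hypothetical data, not from the paper):
# perturb a random "native" C-alpha trace with increasing noise.
rng = np.random.default_rng(0)
native = rng.normal(scale=10.0, size=(50, 3))
noise_levels = (0.5, 1.0, 1.5, 2.0, 2.5, 3.0)
decoy_set = np.stack(
    [native + rng.normal(scale=k, size=native.shape) for k in noise_levels]
)
scores = density_scores(decoy_set)
ranking = np.argsort(scores)  # lowest score = densest region of the set
print("decoys ranked by density score:", ranking)
```

Under this assumption a lower score marks a decoy that sits in a densely populated region of conformational space, and the score needs no knowledge of the native structure. The pairwise loop costs O(M²) superpositions, which is the main expense for large decoy sets.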

CONCLUSIONS

Both the density score and the self-RAPDF functions are decoy-dependent scoring functions for improved protein structure selection. Their success indicates that information from the ensemble of decoy conformations can be used to derive statistical probabilities and facilitate the identification of near-native structures.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a77/449718/d04702b5a973/1472-6807-4-8-1.jpg

Similar articles

Protein structure prediction by all-atom free-energy refinement.

BMC Struct Biol. 2007 Mar 19;7:12. doi: 10.1186/1472-6807-7-12.

The effect of experimental resolution on the performance of knowledge-based discriminatory functions for protein structure selection.

Protein Eng Des Sel. 2006 Sep;19(9):431-7. doi: 10.1093/protein/gzl027. Epub 2006 Jul 14.

How well can we predict native contacts in proteins based on decoy structures and their energies?

Proteins. 2003 Sep 1;52(4):598-608. doi: 10.1002/prot.10444.

Ab initio construction of polypeptide fragments: Accuracy of loop decoy discrimination by an all-atom statistical potential and the AMBER force field with the Generalized Born solvation model.

Proteins. 2003 Apr 1;51(1):21-40. doi: 10.1002/prot.10235.

Distinguishing native conformations of proteins from decoys with an effective free energy estimator based on the OPLS all-atom force field and the Surface Generalized Born solvent model.

Proteins. 2002 Aug 1;48(2):404-22. doi: 10.1002/prot.10171.

Discrimination of native loop conformations in membrane proteins: decoy library design and evaluation of effective energy scoring functions.

Proteins. 2003 Sep 1;52(4):492-509. doi: 10.1002/prot.10404.

sDFIRE: Sequence-specific statistical energy function for protein structure prediction by decoy selections.

J Comput Chem. 2016 May 5;37(12):1119-24. doi: 10.1002/jcc.24298. Epub 2016 Feb 5.

A decoy set for the thermostable subdomain from chicken villin headpiece, comparison of different free energy estimators.

BMC Bioinformatics. 2005 Dec 14;6:301. doi: 10.1186/1471-2105-6-301.

Artefacts and biases affecting the evaluation of scoring functions on decoy sets for protein structure prediction.

Bioinformatics. 2009 May 15;25(10):1271-9. doi: 10.1093/bioinformatics/btp150. Epub 2009 Mar 17.

Cited by

Unsupervised and Supervised Learning over the Energy Landscape for Protein Decoy Selection.

Biomolecules. 2019 Oct 14;9(10):607. doi: 10.3390/biom9100607.

Computational protein structure refinement: Almost there, yet still so far to go.

Wiley Interdiscip Rev Comput Mol Sci. 2017 May-Jun;7(3). doi: 10.1002/wcms.1307. Epub 2017 Mar 28.

Exploring Polypharmacology in Drug Discovery and Repurposing Using the CANDO Platform.

Curr Pharm Des. 2016;22(21):3109-23. doi: 10.2174/1381612822666160325121943.

Detecting local residue environment similarity for recognizing near-native structure models.

Proteins. 2014 Dec;82(12):3255-72. doi: 10.1002/prot.24658. Epub 2014 Oct 30.

How good are simplified models for protein structure prediction?

Adv Bioinformatics. 2014;2014:867179. doi: 10.1155/2014/867179. Epub 2014 Apr 29.

Optimized atomic statistical potentials: assessment of protein interfaces and loops.

Bioinformatics. 2013 Dec 15;29(24):3158-66. doi: 10.1093/bioinformatics/btt560. Epub 2013 Sep 27.

Membrane protein orientation and refinement using a knowledge-based statistical potential.

BMC Bioinformatics. 2013 Sep 18;14:276. doi: 10.1186/1471-2105-14-276.

Statistical potential for modeling and ranking of protein-ligand interactions.

J Chem Inf Model. 2011 Dec 27;51(12):3078-92. doi: 10.1021/ci200377u. Epub 2011 Nov 21.

QMEANclust: estimation of protein model quality by combining a composite scoring function with structural density information.

BMC Struct Biol. 2009 May 20;9:35. doi: 10.1186/1472-6807-9-35.

Artefacts and biases affecting the evaluation of scoring functions on decoy sets for protein structure prediction.

Bioinformatics. 2009 May 15;25(10):1271-9. doi: 10.1093/bioinformatics/btp150. Epub 2009 Mar 17.
