A sampling-based method for ranking protein structural models by integrating multiple scores and features.

Affiliation

College of Computer Science and Technology, Jilin University, Jilin, Changchun 130012, China.

Publication information

Curr Protein Pept Sci. 2011 Sep;12(6):540-8. doi: 10.2174/138920311796957658.

Abstract

One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models but fail to select the best ones from a huge number of candidates as the final output. In this study, we developed a sampling-based machine-learning method that ranks protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility, and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained on different datasets. Then, the two RBF scores and five selected scoring functions developed by others (Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score) are synthesized by a sampling method. Finally, another integrated RBF model ranks the structural models according to features of the sampling distribution. We tested the proposed method on two datasets: the CASP server prediction models for all CASP8 targets and a set of models generated by our in-house software MUFOLD. The results show that our method outperforms every individual scoring function in both best-model selection and overall correlation between the predicted ranking and the actual ranking of structural quality.
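The abstract does not spell out the sampling procedure, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes that every scoring function has been oriented so that higher means better, that "sampling" means drawing random non-negative weights for a linear combination of standardized scores, and that the mean rank over all samples stands in for the final RBF model. All function names (`sampled_mean_ranks`, `spearman`, `top1_loss`) and the toy numbers are hypothetical. The last two functions illustrate the two evaluation criteria named in the abstract: best-model selection and correlation between predicted and actual rankings.

```python
import random
import statistics

def zscores(values):
    """Standardize one scoring function's outputs across all models."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [0.0 if sd == 0 else (v - mean) / sd for v in values]

def ranks(values):
    """Rank positions (1 = highest value); ties share the average position."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def sampled_mean_ranks(score_table, n_samples=500, seed=0):
    """score_table holds one list per scoring function, one value per model.

    Assumption: random weights in [0, 1) are drawn per sample; the paper's
    actual sampling scheme is not described in the abstract.
    """
    rng = random.Random(seed)
    normalized = [zscores(col) for col in score_table]
    n_models = len(normalized[0])
    totals = [0.0] * n_models
    for _ in range(n_samples):
        weights = [rng.random() for _ in normalized]
        combined = [
            sum(w * col[m] for w, col in zip(weights, normalized))
            for m in range(n_models)
        ]
        for m, r in enumerate(ranks(combined)):
            totals[m] += r
    return [t / n_samples for t in totals]

def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def top1_loss(predicted, true_quality):
    """Quality gap between the best available model and the top-ranked one."""
    chosen = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true_quality) - true_quality[chosen]

# Toy data (made up): two scoring functions, three candidate models.
score_table = [[0.9, 0.5, 0.1], [3.0, 2.0, 1.0]]
true_quality = [0.8, 0.6, 0.3]  # e.g. a GDT-TS-like measure, higher = better
mean_ranks = sampled_mean_ranks(score_table)
predicted = [-r for r in mean_ranks]  # lower mean rank = better model
print(spearman(predicted, true_quality), top1_loss(predicted, true_quality))
```

Using ranks rather than raw combined scores keeps scoring functions with very different scales from dominating; a real pipeline would replace the mean-rank summary with richer distribution features and a trained model.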
