Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13.

Author information

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri.

Department of Computer Science, Pacific Lutheran University, Tacoma, Washington.

Publication information

Proteins. 2019 Dec;87(12):1165-1178. doi: 10.1002/prot.25697. Epub 2019 Apr 25.

Abstract

Predicting residue-residue distance relationships (e.g., contacts) has been the key direction for advancing protein structure prediction since the 2014 CASP11 experiment, and deep learning has revolutionized contact and distance distribution prediction since its debut in the 2012 CASP10 experiment. During the 2018 CASP13 experiment, we enhanced our MULTICOM protein structure prediction system with three major components: contact distance prediction based on deep convolutional neural networks, distance-driven template-free (ab initio) modeling, and protein model ranking empowered by deep learning and contact prediction. Our experiment demonstrates that contact distance prediction and deep learning methods are the key reasons that MULTICOM ranked 3rd out of all 98 predictors in both template-free and template-based structure modeling in CASP13. Deep convolutional neural networks can use global information in pairwise residue-residue features, such as coevolution scores, to substantially improve contact distance prediction, which played a decisive role in correctly folding some free modeling and hard template-based modeling targets. Deep learning also successfully integrated one-dimensional structural features, two-dimensional contact information, and three-dimensional structural quality scores to improve protein model quality assessment, and contact prediction was demonstrated for the first time to consistently enhance the ranking of protein models. The success of the MULTICOM system clearly shows that protein contact distance prediction and model selection driven by deep learning hold the key to solving the protein structure prediction problem. However, challenges remain in accurately predicting protein contact distances when few homologous sequences are available, in folding proteins from noisy contact distances, and in ranking models of hard targets.
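The abstract treats contacts as a special case of residue-residue distance relationships but does not spell out the mapping. As a minimal sketch, assuming the conventional CASP definition (two residues are in contact when their Cβ atoms, Cα for glycine, lie within 8 Å), a predicted distance matrix can be reduced to a contact list as follows. The function name, the parameters, and the default sequence-separation cutoff are illustrative assumptions and are not taken from MULTICOM.

```python
from typing import List, Sequence, Tuple

# Conventional CASP contact cutoff in angstroms (an assumption here;
# the abstract itself does not state a threshold).
CONTACT_THRESHOLD = 8.0


def contacts_from_distances(
    dist: Sequence[Sequence[float]],
    threshold: float = CONTACT_THRESHOLD,
    min_separation: int = 6,
) -> List[Tuple[int, int]]:
    """Return residue pairs (i, j), i < j, whose distance is below threshold.

    dist is a symmetric L x L matrix of predicted inter-residue distances.
    Pairs closer than min_separation along the sequence are skipped,
    because neighbouring residues are trivially close in space.
    """
    n = len(dist)
    pairs = []
    for i in range(n):
        # Only scan the upper triangle, starting at the minimum separation.
        for j in range(i + min_separation, n):
            if dist[i][j] < threshold:
                pairs.append((i, j))
    return pairs
```

Called on an L x L matrix of predicted distances, the function returns zero-based index pairs; lowering `min_separation` includes sequence-local pairs, which are usually uninformative for folding.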

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e92b/6899478/54955b983da1/PROT-87-1165-g001.jpg
