Integrating deep learning, threading alignments, and a multi-MSA strategy for high-quality protein monomer and complex structure prediction in CASP15.

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.

Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, USA.

Publication information

Proteins. 2023 Dec;91(12):1684-1703. doi: 10.1002/prot.26585. Epub 2023 Aug 31.

Abstract

We report the results of the "UM-TBM" and "Zheng" groups in CASP15 for protein monomer and complex structure prediction. These prediction sets were obtained using the D-I-TASSER and DMFold-Multimer algorithms, respectively. For monomer structure prediction, D-I-TASSER introduced four new features during CASP15: (i) a multiple sequence alignment (MSA) generation protocol that combines multi-source MSA searching with a structural modeling-based MSA ranker; (ii) attention-network-based spatial restraints; (iii) a multi-domain module that performs domain partition and arrangement for domain-level templates and spatial restraints; and (iv) an optimized I-TASSER-based folding simulation system for full-length model creation, guided by a combination of deep learning restraints, threading alignments, and knowledge-based potentials. For the 47 free modeling targets in CASP15, the final models predicted by D-I-TASSER had an average TM-score 19% higher than that of models from the standard AlphaFold2 program. We thus showed that traditional Monte Carlo-based folding simulations, when appropriately coupled with deep learning algorithms, can generate models with improved accuracy over end-to-end deep learning methods alone. For protein complex structure prediction, DMFold-Multimer generated models by integrating a new MSA generation algorithm (DeepMSA2) with the end-to-end modeling module from AlphaFold2-Multimer. For the 38 complex targets, DMFold-Multimer generated models with an average TM-score of 0.83 and an Interface Contact Score of 0.60, both significantly higher than those of competing complex prediction tools. Our analyses of complexes highlighted the critical role played by MSA generation, ranking, and pairing in protein complex structure prediction. We also discuss room for future improvement in the areas of viral protein modeling and complex model ranking.
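The accuracy figures above are stated as TM-scores. As a reading aid, the following is a minimal sketch of the standard TM-score formula for a single, fixed superposition, plus the relative-gain arithmetic behind a statement such as "19% higher". It is not the authors' evaluation code: the function names are hypothetical, the per-residue distances are assumed to be given, the 0.5 Å floor on d0 is a convention assumed here, and the search over superpositions that the full TM-score performs is omitted.

```python
from statistics import mean


def tm_score_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in Angstroms) of the TM-score."""
    if target_length <= 15:
        # The cube-root formula is not meaningful for very short chains;
        # the 0.5 floor is an assumption of this sketch.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length: int) -> float:
    """TM-score for one fixed superposition.

    distances: distances (Angstroms) between aligned residue pairs.
    target_length: number of residues in the target (normalization length).
    The full TM-score takes the maximum of this value over superpositions;
    that search is omitted here.
    """
    d0 = tm_score_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


def relative_gain(scores_a, scores_b) -> float:
    """Relative difference of mean scores, (mean_a - mean_b) / mean_b."""
    return (mean(scores_a) - mean(scores_b)) / mean(scores_b)


if __name__ == "__main__":
    # Toy values for illustration only; not data from the paper.
    print(tm_score([0.0] * 50, 100))
    print(relative_gain([0.6, 0.6], [0.5, 0.5]))
```

Because each aligned pair contributes at most 1 and the sum is divided by the target length, the score lies between 0 and 1, and unaligned residues lower it.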
