Deep-learning contact-map guided protein structure prediction in CASP13.

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan.

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.

Publication information

Proteins. 2019 Dec;87(12):1149-1164. doi: 10.1002/prot.25792. Epub 2019 Aug 14.

Abstract

We report the results of two fully automated structure prediction pipelines, "Zhang-Server" and "QUARK", in CASP13. The pipelines were built upon the C-I-TASSER and C-QUARK programs, which in turn are based on I-TASSER and QUARK but with three new modules: (a) a novel multiple sequence alignment (MSA) generation protocol to construct deep sequence-profiles for contact prediction; (b) an improved meta-method, NeBcon, which combines multiple contact predictors, including ResPRE that predicts contact-maps by coupling precision-matrices with deep residual convolutional neural-networks; and (c) an optimized contact potential to guide structure assembly simulations. For 50 CASP13 FM domains that lacked homologous templates, average TM-scores of the first models produced by C-I-TASSER and C-QUARK were 28% and 56% higher than those constructed by I-TASSER and QUARK, respectively. For the first time, contact-map predictions demonstrated usefulness on TBM domains with close homologous templates, where TM-scores of C-I-TASSER models were significantly higher than those of I-TASSER models with a P-value <.05. Detailed data analyses showed that the success of C-I-TASSER and C-QUARK was mainly due to the increased accuracy of deep-learning-based contact-maps, as well as the careful balance between sequence-based contact restraints, threading templates, and generic knowledge-based potentials. Nevertheless, challenges still remain for predicting quaternary structure of multi-domain proteins, due to the difficulties in domain partitioning and domain reassembly. In addition, contact prediction in terminal regions was often unsatisfactory due to the sparsity of MSAs. Development of new contact-based domain partitioning and assembly methods and training contact models on sparse MSAs may help address these issues.
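The abstract attributes much of the improvement to more accurate predicted contact maps. As a rough illustration of how such maps are commonly scored (this is not the authors' code), the sketch below assumes the usual CASP convention: two residues are in contact when their Cβ atoms lie within 8 Å, and a pair is long-range when the sequence separation is at least 24. The function names and the `frac` parameter are hypothetical and chosen only for this example.

```python
import numpy as np


def contact_map(cb_coords, cutoff=8.0):
    """Return a boolean L x L map that is True where two C-beta atoms
    lie closer than `cutoff` angstroms (assumed CASP-style definition)."""
    diff = cb_coords[:, None, :] - cb_coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist < cutoff


def top_l_precision(pred_prob, native, min_sep=24, frac=1.0):
    """Precision of the top (frac * L) predicted long-range contacts.

    pred_prob : L x L array of predicted contact scores; only the upper
                triangle (i < j) is read.
    native    : L x L boolean contact map of the experimental structure.
    min_sep   : minimum sequence separation j - i for a pair to count.
    """
    length = native.shape[0]
    # Indices of all pairs with j - i >= min_sep.
    i, j = np.triu_indices(length, k=min_sep)
    if i.size == 0:
        return float("nan")
    # Rank candidate pairs by descending predicted score.
    order = np.argsort(-pred_prob[i, j], kind="stable")
    n_top = min(max(1, int(length * frac)), i.size)
    top = order[:n_top]
    # Fraction of the selected pairs that are true contacts.
    return float(native[i[top], j[top]].mean())
```

Calling `top_l_precision(pred, contact_map(native_cb))` on a predicted score matrix and native Cβ coordinates gives the top-L long-range precision; passing `frac=0.5` or `frac=0.2` gives the top-L/2 and top-L/5 variants that are also often reported.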

Similar articles

1
Deep-learning contact-map guided protein structure prediction in CASP13.
Proteins. 2019 Dec;87(12):1149-1164. doi: 10.1002/prot.25792. Epub 2019 Aug 14.
2
Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14.
Proteins. 2021 Dec;89(12):1734-1751. doi: 10.1002/prot.26193. Epub 2021 Aug 7.
3
Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12.
Proteins. 2018 Mar;86 Suppl 1(Suppl 1):136-151. doi: 10.1002/prot.25414. Epub 2017 Nov 14.
4
Integration of QUARK and I-TASSER for Ab Initio Protein Structure Prediction in CASP11.
Proteins. 2016 Sep;84 Suppl 1(Suppl 1):76-86. doi: 10.1002/prot.24930. Epub 2015 Sep 23.
5
Template-based protein structure prediction in CASP11 and retrospect of I-TASSER in the last decade.
Proteins. 2016 Sep;84 Suppl 1(Suppl 1):233-46. doi: 10.1002/prot.24918. Epub 2015 Sep 18.
7
Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13.
Proteins. 2019 Dec;87(12):1165-1178. doi: 10.1002/prot.25697. Epub 2019 Apr 25.
8
Analysis of distance-based protein structure prediction by deep learning in CASP13.
Proteins. 2019 Dec;87(12):1069-1081. doi: 10.1002/prot.25810. Epub 2019 Sep 13.
9
Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10.
Proteins. 2014 Feb;82 Suppl 2(0 2):175-87. doi: 10.1002/prot.24341. Epub 2013 Aug 31.

Cited by

2
Predicting RNA Structure Utilizing Attention from Pretrained Language Models.
J Chem Inf Model. 2025 Jul 14;65(13):6483-6498. doi: 10.1021/acs.jcim.4c02094. Epub 2025 Jul 2.
4
M-DeepAssembly: enhanced DeepAssembly based on multi-objective multi-domain protein conformation sampling.
BMC Bioinformatics. 2025 May 5;26(1):120. doi: 10.1186/s12859-025-06131-2.
5
AlphaFold2, SPINE-X, and Seder on Four Hard CASP Targets.
Methods Mol Biol. 2025;2867:141-152. doi: 10.1007/978-1-0716-4196-5_8.
7
Weighted families of contact maps to characterize conformational ensembles of (highly-)flexible proteins.
Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae627.
9
Subtractive proteomics-based vaccine targets annotation and reverse vaccinology approaches to identify multiepitope vaccine against .
Heliyon. 2024 May 22;10(11):e31304. doi: 10.1016/j.heliyon.2024.e31304. eCollection 2024 Jun 15.
10
Recent Progress of Protein Tertiary Structure Prediction.
Molecules. 2024 Feb 13;29(4):832. doi: 10.3390/molecules29040832.

References

1
ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks.
Bioinformatics. 2019 Nov 1;35(22):4647-4655. doi: 10.1093/bioinformatics/btz291.
2
Clustering huge protein sequence sets in linear time.
Nat Commun. 2018 Jun 29;9(1):2542. doi: 10.1038/s41467-018-04964-5.
3
High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features.
Bioinformatics. 2018 Oct 1;34(19):3308-3315. doi: 10.1093/bioinformatics/bty341.
4
Enhancing Evolutionary Couplings with Deep Convolutional Neural Networks.
Cell Syst. 2018 Jan 24;6(1):65-74.e3. doi: 10.1016/j.cels.2017.11.014. Epub 2017 Dec 20.
5
DNCON2: improved protein contact prediction using two-level deep convolutional neural networks.
Bioinformatics. 2018 May 1;34(9):1466-1472. doi: 10.1093/bioinformatics/btx781.
6
Evaluation of the template-based modeling in CASP12.
Proteins. 2018 Mar;86 Suppl 1(Suppl 1):321-334. doi: 10.1002/prot.25425. Epub 2017 Dec 4.
7
Assessment of hard target modeling in CASP12 reveals an emerging role of alignment-based contact prediction methods.
Proteins. 2018 Mar;86 Suppl 1:97-112. doi: 10.1002/prot.25423. Epub 2017 Nov 29.
8
Improved protein contact predictions with the MetaPSICOV2 server in CASP12.
Proteins. 2018 Mar;86 Suppl 1(Suppl 1):78-83. doi: 10.1002/prot.25379. Epub 2017 Sep 29.
9
Origins of coevolution between residues distant in protein 3D structures.
Proc Natl Acad Sci U S A. 2017 Aug 22;114(34):9122-9127. doi: 10.1073/pnas.1702664114. Epub 2017 Aug 7.
10
NeBcon: protein contact map prediction using neural network training coupled with naïve Bayes classifiers.
Bioinformatics. 2017 Aug 1;33(15):2296-2306. doi: 10.1093/bioinformatics/btx164.
