FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution.

Affiliation

Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative - LCQB, 75005 Paris, France.

Publication information

PLoS Comput Biol. 2020 Oct 9;16(10):e1007621. doi: 10.1371/journal.pcbi.1007621. eCollection 2020 Oct.

DOI:10.1371/journal.pcbi.1007621

PMID:33035205

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7577475/

Abstract

Predicting three-dimensional protein structures and assembling protein complexes from sequence information are among the most prominent tasks in computational biology. Recently, substantial progress has been made for single proteins by combining unsupervised coevolutionary sequence analysis with structurally supervised deep learning. While it reaches impressive accuracy in predicting residue-residue contacts, deep learning has a number of disadvantages. The need for large structural training sets limits its applicability to multi-protein complexes, and the deep architecture of convolutional neural networks makes them intrinsically hard to interpret. Here we introduce FilterDCA, a simpler supervised predictor for inter-domain and inter-protein contacts. It is based on the fact that protein contact maps show typical contact patterns, which result from secondary structure and are reflected by patterns in coevolutionary analysis. We explicitly integrate averaged contact patterns with coevolutionary scores derived by Direct Coupling Analysis, improving performance over standard coevolutionary analysis while remaining fully transparent and interpretable. The FilterDCA code is available at http://gitlab.lcqb.upmc.fr/muscat/FilterDCA.
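
The idea of matching averaged contact patterns against a coevolutionary score map can be illustrated with a minimal sketch. This is not the FilterDCA implementation: the function name `filter_score_map`, the 3 x 3 window, the zero padding at the borders, and the use of a Pearson correlation as the matching score are all assumptions made for illustration. The toy patterns stand in for averaged contact patterns (a diagonal stripe, as parallel secondary-structure elements might produce, and an anti-diagonal stripe for antiparallel ones).

```python
import numpy as np


def filter_score_map(scores, pattern):
    """Correlate every k x k window of a score map with a small pattern.

    scores  : 2D array of pairwise scores (e.g. DCA scores between two domains)
    pattern : k x k array, k odd (a hypothetical averaged contact pattern)

    Returns an array shaped like `scores` whose entry (i, j) is the Pearson
    correlation between `pattern` and the window centred on (i, j).
    Borders are zero-padded; a constant window or pattern yields 0.
    """
    k = pattern.shape[0]
    if pattern.shape != (k, k) or k % 2 == 0:
        raise ValueError("pattern must be square with odd side length")
    half = k // 2
    padded = np.pad(scores.astype(float), half, mode="constant")

    centred_pattern = pattern - pattern.mean()
    pattern_norm = np.linalg.norm(centred_pattern)

    out = np.zeros(scores.shape, dtype=float)
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            window = padded[i:i + k, j:j + k]
            centred_window = window - window.mean()
            denom = np.linalg.norm(centred_window) * pattern_norm
            if denom > 0:
                out[i, j] = (centred_window * centred_pattern).sum() / denom
    return out


# Toy score map with a short diagonal stripe of high scores.
scores = np.zeros((7, 7))
scores[[2, 3, 4], [2, 3, 4]] = 1.0

diagonal = np.eye(3)                  # toy "parallel" pattern
antidiagonal = np.fliplr(np.eye(3))   # toy "antiparallel" pattern

print(filter_score_map(scores, diagonal)[3, 3])
print(filter_score_map(scores, antidiagonal)[3, 3])
```

At the centre of the stripe the diagonal pattern matches almost perfectly (correlation close to 1), while the anti-diagonal pattern gives a correlation close to 0. A supervised predictor could then combine such pattern-match scores with the raw coevolutionary score of each residue pair; how FilterDCA performs that combination is not specified in this abstract.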

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/526f/7577475/258eb4d30471/pcbi.1007621.g001.jpg
