Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks.

Affiliation

School of Computer Science and Informatics, Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.

Publication information

BMC Bioinformatics. 2014 Jan 10;15:6. doi: 10.1186/1471-2105-15-6.

DOI:10.1186/1471-2105-15-6

PMID:24410833

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3893389/

Abstract

BACKGROUND

Protein inter-residue contact maps provide a translation- and rotation-invariant topological representation of a protein. They can be used as an intermediate step in protein structure prediction. However, contact map prediction is an unbalanced problem, because a protein structure contains far fewer contacts than non-contacts.

In this study we explore the possibility of completely eliminating the unbalanced nature of the contact map prediction problem by predicting real-valued distances between residues. Predicting full inter-residue distance maps and applying them to protein structure prediction has been relatively unexplored in the past.
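The imbalance described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the 8 Å contact threshold, the minimum sequence separation of 6, and the helper names (`distance_map`, `contact_map`, `contact_fraction`, `toy_chain`) are assumptions introduced here for illustration, and the coordinates come from a toy random walk rather than a real protein.

```python
import math
import random


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances (in Å)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_map(dmap, threshold=8.0):
    """Binarise a distance map. The 8 Å threshold is an assumed convention."""
    return [[d < threshold for d in row] for row in dmap]


def contact_fraction(cmap, min_separation=6):
    """Fraction of residue pairs (i, j) with j - i >= min_separation in contact."""
    n = len(cmap)
    pairs = [(i, j) for i in range(n) for j in range(i + min_separation, n)]
    if not pairs:
        raise ValueError("chain too short for the requested sequence separation")
    return sum(cmap[i][j] for i, j in pairs) / len(pairs)


def toy_chain(n, step=3.8, seed=0):
    """Hypothetical stand-in for a structure: a random walk with fixed step length."""
    rng = random.Random(seed)
    coords = [(0.0, 0.0, 0.0)]
    while len(coords) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-9:
            continue  # degenerate direction, draw again
        x, y, z = coords[-1]
        coords.append((x + step * v[0] / norm,
                       y + step * v[1] / norm,
                       z + step * v[2] / norm))
    return coords


if __name__ == "__main__":
    dmap = distance_map(toy_chain(100))
    frac = contact_fraction(contact_map(dmap))
    print(f"contacts: {frac:.1%} of residue pairs, non-contacts: {1 - frac:.1%}")
```

A binary contact map discards everything except which side of the threshold each pair falls on, so a classifier sees two classes of very different size. The real-valued distance map keeps one regression target per residue pair, which is the reformulation the study explores.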

RESULTS

We first demonstrate that native-like distance maps can reproduce 3D structures almost identical to the targets, giving an average RMSD of 0.5 Å. In addition, physical maps corrupted with a random error of ±6 Å can reconstruct the targets to within an average RMSD of 2 Å.

After demonstrating the reconstruction potential of distance maps, we develop two classes of predictors using two-dimensional recursive neural networks: an ab initio predictor that relies only on the protein sequence and evolutionary information, and a template-based predictor to which additional structural homology information is provided. We find that the ab initio predictor reproduces distances with an RMSD of 6 Å, regardless of the evolutionary content provided. Furthermore, we show that the template-based predictor exploits both sequence and structure information even in cases of dubious homology, and outperforms the best template hit by a clear margin of up to 3.7 Å.

Lastly, we demonstrate the ability of the two predictors to reconstruct the CASP9 targets shorter than 200 residues, producing results similar to those of the state-of-the-art machine learning approach implemented in the Distill server.
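As a rough illustration of the quantities discussed above, the following Python sketch corrupts a distance map with a bounded random error and measures the RMSD between two maps. It makes several assumptions that the abstract does not confirm: the noise is drawn uniformly from [-6, 6] Å, negative distances are clipped to zero, and the RMSD is taken over the upper triangle of the map. The function names are hypothetical. The sketch works at the level of distance maps only; it does not perform the 3D reconstruction used to obtain the 0.5 Å and 2 Å structure-level figures.

```python
import math
import random


def corrupt_distance_map(dmap, max_error=6.0, seed=0):
    """Add a random error in [-max_error, +max_error] Å to every off-diagonal
    distance, keeping the map symmetric and non-negative.
    Uniform noise is an assumption made for this sketch."""
    rng = random.Random(seed)
    n = len(dmap)
    noisy = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = max(0.0, dmap[i][j] + rng.uniform(-max_error, max_error))
            noisy[i][j] = noisy[j][i] = d
    return noisy


def distance_rmsd(dmap_a, dmap_b):
    """Root-mean-square deviation between two distance maps of equal size,
    computed over the upper triangle (each residue pair counted once)."""
    n = len(dmap_a)
    if n != len(dmap_b) or n < 2:
        raise ValueError("maps must have the same size and at least two residues")
    squared = [(dmap_a[i][j] - dmap_b[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Tiny hand-written map (in Å), not taken from any real structure.
    native = [[0.0, 5.0, 10.0],
              [5.0, 0.0, 7.0],
              [10.0, 7.0, 0.0]]
    noisy = corrupt_distance_map(native)
    print(f"map-level RMSD after corruption: {distance_rmsd(native, noisy):.2f} Å")
```

Under these assumptions the same `distance_rmsd` function could score a predicted map against the native one, which is one plausible reading of how a figure such as the 6 Å distance RMSD of the ab initio predictor would be computed.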

CONCLUSIONS

The methodology presented here, if complemented by more complex reconstruction protocols, can represent a possible path to improve machine learning algorithms for 3D protein structure prediction. Moreover, it can be used as an intermediary step in protein structure predictions either on its own or complemented by NMR restraints.
