Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks.

Author information

School of Computer Science and Informatics, Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.

Publication information

BMC Bioinformatics. 2014 Jan 10;15:6. doi: 10.1186/1471-2105-15-6.

Abstract

BACKGROUND

Protein inter-residue contact maps provide a translation- and rotation-invariant topological representation of a protein. They can be used as an intermediary step in protein structure prediction. However, contact map prediction is an unbalanced problem, because a protein structure contains far fewer contacts than non-contacts.

In this study we explore the possibility of completely eliminating the unbalanced nature of the contact map prediction problem by predicting real-valued distances between residues. Predicting full inter-residue distance maps and applying them to protein structure prediction has been relatively unexplored in the past.
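To make the imbalance concrete, the following minimal sketch builds a distance map and a thresholded contact map from a toy Cα trace. It checks two properties mentioned above: the distance map does not change under rotation and translation, and contacts are the minority class. The ideal α-helix geometry, the 8 Å contact threshold, and all function names are illustrative assumptions, not details taken from the study.

```python
import math


def helix_ca_coords(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Toy C-alpha trace of an ideal alpha-helix (illustrative geometry only)."""
    coords = []
    for i in range(n):
        angle = math.radians(turn_deg * i)
        coords.append((radius * math.cos(angle), radius * math.sin(angle), rise * i))
    return coords


def distance_map(coords):
    """Real-valued map: Euclidean distance between every pair of residues."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_map(dmap, threshold=8.0):
    """Binary map: True where two residues are closer than the (assumed) threshold."""
    return [[d < threshold for d in row] for row in dmap]


def contact_fraction(cmap):
    """Fraction of residue pairs (i < j) that are in contact."""
    n = len(cmap)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cmap[i][j] for i, j in pairs) / len(pairs)


def rigid_move(coords, angle_deg, shift):
    """Rotate about the z axis, then translate by `shift`."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


if __name__ == "__main__":
    coords = helix_ca_coords(60)
    dmap = distance_map(coords)
    moved = distance_map(rigid_move(coords, 37.0, (5.0, -3.0, 12.0)))
    max_diff = max(abs(dmap[i][j] - moved[i][j]) for i in range(60) for j in range(60))
    print("largest change after rigid motion:", max_diff)
    print("fraction of pairs in contact:", contact_fraction(contact_map(dmap)))
```

Thresholding discards the actual distance and leaves a sparse binary target, whereas the real-valued map assigns every residue pair a regression target, which is the motivation stated above for predicting distances directly.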

RESULTS

We first demonstrate that native-like distance maps can reproduce 3D structures almost identical to the targets, giving an average RMSD of 0.5 Å. In addition, corrupted physical maps with an introduced random error of ±6 Å can reconstruct the targets to within an average RMSD of 2 Å.

After demonstrating the reconstruction potential of distance maps, we develop two classes of predictors using two-dimensional recursive neural networks: an ab initio predictor that relies only on the protein sequence and evolutionary information, and a template-based predictor in which additional structural homology information is provided. We find that the ab initio predictor reproduces distances with an RMSD of 6 Å, regardless of the evolutionary content provided. Furthermore, we show that the template-based predictor exploits both sequence and structure information even in cases of dubious homology, and outperforms the best template hit by a clear margin of up to 3.7 Å.

Lastly, we demonstrate the ability of the two predictors to reconstruct the CASP9 targets shorter than 200 residues, producing results similar to those of the state-of-the-art machine learning approach implemented in the Distill server.
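The RMSD figures quoted for distances compare two distance maps entry by entry. As a minimal sketch of that kind of measurement, the code below corrupts a toy distance map with a bounded random error and computes the map-level RMSD. The uniform noise model, the straight-chain toy map with 3.8 Å spacing, and all function names are assumptions made for illustration; the abstract does not specify how the error was introduced, and no 3D reconstruction is attempted here.

```python
import math
import random


def chain_distance_map(n, spacing=3.8):
    """Toy distance map for n residues placed on a straight line (hypothetical input)."""
    return [[spacing * abs(i - j) for j in range(n)] for i in range(n)]


def add_uniform_error(dmap, max_error=6.0, seed=0):
    """Add an independent error drawn uniformly from [-max_error, +max_error]
    to every pair i < j, keeping the map symmetric and distances non-negative."""
    rng = random.Random(seed)
    n = len(dmap)
    noisy = [row[:] for row in dmap]
    for i in range(n):
        for j in range(i + 1, n):
            d = max(0.0, dmap[i][j] + rng.uniform(-max_error, max_error))
            noisy[i][j] = noisy[j][i] = d
    return noisy


def map_rmsd(a, b):
    """Root-mean-square deviation over all residue pairs i < j of two maps."""
    n = len(a)
    sq = [(a[i][j] - b[i][j]) ** 2 for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(sq) / len(sq))


if __name__ == "__main__":
    true_map = chain_distance_map(50)
    noisy_map = add_uniform_error(true_map, max_error=6.0, seed=1)
    print("map RMSD after corruption:", map_rmsd(true_map, noisy_map))
```

Under this assumed noise model, an error bounded by ±e has a root-mean-square value of about e/√3, so a ±6 Å bound corresponds to a map-level RMSD of roughly 3.5 Å. This is a property of the uniform distribution, not a result reported in the study.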

CONCLUSIONS

The methodology presented here, if complemented by more complex reconstruction protocols, may offer a path to improving machine learning algorithms for 3D protein structure prediction. Moreover, it can be used as an intermediary step in protein structure prediction, either on its own or complemented by NMR restraints.
