

Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks.

Author information

Li Yang, Zhang Chengxin, Bell Eric W, Zheng Wei, Zhou Xiaogen, Yu Dong-Jun, Zhang Yang

Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America.

Publication information

PLoS Comput Biol. 2021 Mar 26;17(3):e1008865. doi: 10.1371/journal.pcbi.1008865. eCollection 2021 Mar.

Abstract

The topology of protein folds can be specified by the inter-residue contact-maps and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is in its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases and therefore minimize the information loss during the course of contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in the top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that a simple re-training of the TripletRes model with more proteins can lead to further improvement with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel efficient approach to extend the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing 3D structure of proteins that lack homologous templates in the PDB library.
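
The top-L and top-L/5 long-range precision figures quoted above can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes the common CASP convention that a "long-range" pair has sequence separation |i - j| >= 24 and that the true contacts are already known (typically residue pairs whose C-beta atoms lie within 8 Å); neither threshold is stated in the abstract. The function name `top_k_long_range_precision` and the toy data are hypothetical.

```python
def top_k_long_range_precision(pred, true_contacts, divisor=1, min_sep=24):
    """Precision of the top L/divisor long-range predictions.

    pred          -- L x L nested list of predicted contact probabilities
    true_contacts -- set of (i, j) index pairs with i < j that are real contacts
    divisor       -- 1 for top-L, 5 for top-L/5, and so on
    min_sep       -- minimum sequence separation (assumed CASP long-range cutoff)
    """
    length = len(pred)
    # Collect only long-range pairs (upper triangle, j - i >= min_sep).
    candidates = [
        (pred[i][j], i, j)
        for i in range(length)
        for j in range(i + min_sep, length)
    ]
    # Rank by predicted probability, highest first.
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = candidates[: max(1, length // divisor)]
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if (i, j) in true_contacts)
    return hits / len(top)


# Toy example: a 30-residue "protein" with six scored long-range pairs.
L = 30
toy_pred = [[0.0] * L for _ in range(L)]
scores = {(0, 24): 0.9, (1, 25): 0.8, (0, 29): 0.7,
          (2, 26): 0.6, (3, 28): 0.5, (4, 29): 0.4}
for (i, j), p in scores.items():
    toy_pred[i][j] = toy_pred[j][i] = p
toy_pred[0][1] = toy_pred[1][0] = 0.99  # short-range pair, ignored by the metric
toy_true = {(0, 24), (1, 25), (2, 26)}

top_l5 = top_k_long_range_precision(toy_pred, toy_true, divisor=5)
top_l10 = top_k_long_range_precision(toy_pred, toy_true, divisor=10)
```

In this toy case the top L/5 = 6 long-range predictions contain 3 true contacts, so `top_l5` is 0.5, while the top L/10 = 3 predictions contain 2, so `top_l10` is 2/3. The high-scoring short-range pair does not affect either value.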


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8055/8026059/75516792b8b7/pcbi.1008865.g001.jpg
