FUpred: detecting protein domains through deep-learning-based contact map prediction.

Author information

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109.

Computer Science and Engineering Department, Michigan State University, East Lansing, MI 48824, USA.

Publication information

Bioinformatics. 2020 Jun 1;36(12):3749-3757. doi: 10.1093/bioinformatics/btaa217.

Abstract

MOTIVATION

Protein domains are subunits that can fold and function independently. Correct domain boundary assignment is thus a critical step toward accurate protein structure and function analyses. There is, however, no efficient algorithm available for accurate domain prediction from sequence. The problem is particularly challenging for proteins with discontinuous domains, which consist of domain segments that are separated along the sequence.

RESULTS

We developed a new algorithm, FUpred, which predicts protein domain boundaries using contact maps created by deep residual neural networks coupled with coevolutionary precision matrices. The core idea of the algorithm is to retrieve domain boundary locations by maximizing the number of intra-domain contacts while minimizing the number of inter-domain contacts in the contact maps. FUpred was tested on a large-scale dataset consisting of 2549 proteins and generated correct single- and multi-domain classifications with a Matthews correlation coefficient of 0.799, which was 19.1% (or 5.3%) higher than the best machine learning (or threading)-based method. For proteins with discontinuous domains, the domain boundary detection and normalized domain overlapping scores of FUpred were 0.788 and 0.521, respectively, which were 17.3% and 23.8% higher than the best control method. The results demonstrate a new avenue to accurately detect domain composition from sequence alone, especially for discontinuous, multi-domain proteins.
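
The core idea stated above (few inter-domain contacts, many intra-domain contacts) can be illustrated with a minimal sketch. This is not FUpred's scoring function, which the abstract does not give. The score here is a simple stand-in: the fraction of contacts that cross a single cut point. It handles only one continuous two-domain split, and the names `split_score`, `best_cut`, `toy_contact_map` and the `min_len` parameter are hypothetical.

```python
import numpy as np


def split_score(contacts: np.ndarray, cut: int) -> float:
    """Fraction of contacts that cross a cut placed before residue index `cut`.

    `contacts` is a symmetric binary L x L contact map. Because the total
    number of contacts is fixed, a lower crossing fraction also means more
    contacts stay inside the two candidate domains.
    """
    upper = np.triu(contacts, k=1)        # count each residue pair once, skip diagonal
    inter = upper[:cut, cut:].sum()       # pairs with one residue on each side of the cut
    total = upper.sum()
    return float(inter / total) if total > 0 else 0.0


def best_cut(contacts: np.ndarray, min_len: int = 2) -> int:
    """Cut index with the lowest crossing fraction (illustrative proxy only).

    `min_len` is an assumed minimum domain length; without it, cuts near
    the chain ends would trivially cross very few contacts.
    """
    n = contacts.shape[0]
    candidates = range(min_len, n - min_len + 1)
    return min(candidates, key=lambda c: split_score(contacts, c))


def toy_contact_map() -> np.ndarray:
    """Two fully connected 5-residue blocks joined by a single contact."""
    contacts = np.zeros((10, 10), dtype=int)
    contacts[:5, :5] = 1
    contacts[5:, 5:] = 1
    contacts[4, 5] = contacts[5, 4] = 1   # the only inter-block contact
    return contacts


if __name__ == "__main__":
    toy = toy_contact_map()
    print(best_cut(toy))                  # boundary between the two blocks
```

On the toy map, the lowest-scoring cut falls between the two blocks, since only one of the 21 residue pairs in contact crosses it. Discontinuous domains, the harder case the abstract highlights, would need a search over more than one cut point, which this sketch does not attempt.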

AVAILABILITY AND IMPLEMENTATION

https://zhanglab.ccmb.med.umich.edu/FUpred.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
