DeepGOPlus: improved protein function prediction from sequence.

Affiliation

Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia.

Publication information

Bioinformatics. 2020 Jan 15;36(2):422-429. doi: 10.1093/bioinformatics/btz595.

Abstract

MOTIVATION

Protein function prediction is one of the major tasks of bioinformatics and can help with a wide range of biological problems, such as understanding disease mechanisms or finding drug targets. Many methods are available for predicting protein functions from sequence-based features, protein-protein interaction networks, protein structure or literature. However, other than sequence, most of these features are difficult to obtain or not available for many proteins, which limits their scope. Furthermore, the performance of sequence-based function prediction methods is often lower than that of methods that incorporate multiple features, and predicting protein functions may require a lot of time.

RESULTS

We developed a novel method for predicting protein functions from sequence alone, which combines a deep convolutional neural network (CNN) model with sequence-similarity-based predictions. Our CNN model scans the sequence for motifs that are predictive of protein functions and combines this with the functions of similar proteins (if available). We evaluate the performance of DeepGOPlus using the CAFA3 evaluation measures and achieve an Fmax of 0.390, 0.557 and 0.614 for the BPO, MFO and CCO evaluations, respectively. These results would have made DeepGOPlus one of the three best predictors in CCO and the second-best-performing method in the BPO and MFO evaluations. We also compare DeepGOPlus with state-of-the-art methods such as DeepText2GO and GOLabeler on another dataset. DeepGOPlus can annotate around 40 protein sequences per second on common hardware, thereby making fast and accurate function predictions available for a wide range of proteins.
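To make the two ideas in this paragraph concrete, the following is a minimal sketch of how CNN scores and similarity-based scores might be merged, and how a protein-centric Fmax of the kind used in CAFA can be computed. The abstract only says the two sources are combined; the weighted average, the `alpha` parameter, the function names and the toy GO identifiers are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, Set

# protein ID -> GO term -> score in [0, 1]
Scores = Dict[str, Dict[str, float]]


def combine_scores(cnn: Scores, sim: Scores, alpha: float = 0.5) -> Scores:
    """Assumed combination rule: weighted average of the two score sources.

    A term missing from one source contributes a score of 0 from that source.
    `alpha` is a hypothetical weight on the similarity-based scores.
    """
    combined: Scores = {}
    for protein in set(cnn) | set(sim):
        cnn_p = cnn.get(protein, {})
        sim_p = sim.get(protein, {})
        combined[protein] = {
            term: alpha * sim_p.get(term, 0.0) + (1 - alpha) * cnn_p.get(term, 0.0)
            for term in set(cnn_p) | set(sim_p)
        }
    return combined


def fmax(pred: Scores, truth: Dict[str, Set[str]], steps: int = 100) -> float:
    """Protein-centric Fmax: the best F-measure over a grid of thresholds.

    Precision is averaged over proteins with at least one prediction at the
    threshold; recall is averaged over all annotated benchmark proteins.
    """
    annotated = {p: terms for p, terms in truth.items() if terms}
    best = 0.0
    for i in range(1, steps + 1):
        threshold = i / steps
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in annotated.items():
            scores = pred.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= threshold}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(annotated)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy data with made-up identifiers, for illustration only.
cnn_scores = {"P1": {"GO:A": 0.9, "GO:B": 0.2}, "P2": {"GO:A": 0.6}}
sim_scores = {"P1": {"GO:A": 0.5}, "P2": {"GO:B": 1.0}}
annotations = {"P1": {"GO:A"}, "P2": {"GO:B"}}

combined_scores = combine_scores(cnn_scores, sim_scores, alpha=0.5)
print(fmax(cnn_scores, annotations), fmax(combined_scores, annotations))
```

In this toy case the similarity-based scores supply the correct term for P2 that the CNN scores miss, so the combined predictions reach a higher Fmax than the CNN scores alone. A real evaluation would also propagate annotations up the ontology before scoring, which this sketch omits.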

AVAILABILITY AND IMPLEMENTATION

http://deepgoplus.bio2vec.net/

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
