RFPR-IDP: reduce the false positive rates for intrinsically disordered protein and region prediction by incorporating both fully ordered proteins and disordered proteins.

Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China.

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China.

Publication information

Brief Bioinform. 2021 Mar 22;22(2):2000-2011. doi: 10.1093/bib/bbaa018.

Abstract

Intrinsically disordered proteins and regions (IDPs/IDRs) are an important class of proteins involved in many crucial biological functions. Accurate prediction of IDPs/IDRs benefits the prediction of protein structures and functions. Most existing methods ignore fully ordered proteins, which contain no IDRs, during training and testing. As a result, the corresponding predictors tend to predict fully ordered proteins as disordered. Because these methods were evaluated only on datasets consisting of disordered proteins with few or no fully ordered proteins, the problem has escaped researchers' attention. However, most newly sequenced proteins are fully ordered, so these predictors fail to accurately distinguish ordered from disordered proteins in real-world applications. To address this, we propose RFPR-IDP, a new method trained on both fully ordered and disordered proteins and built on a combination of a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. Experimental results show that although existing predictors perform well on disordered proteins, they tend to predict fully ordered proteins as disordered. In contrast, RFPR-IDP correctly predicts fully ordered proteins and outperforms 10 other state-of-the-art methods when evaluated on a test dataset containing both fully ordered and disordered proteins. The web server and datasets of RFPR-IDP are freely available at http://bliulab.net/RFPR-IDP/server.
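The evaluation issue described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' evaluation code: the function name `false_positive_rate`, the 1 = disordered / 0 = ordered label encoding, and all label sequences are made-up assumptions chosen only to show how a per-residue false positive rate can look perfect on a disordered-only test set and much worse once a fully ordered protein is added.

```python
from typing import Sequence


def false_positive_rate(true_labels: Sequence[int], predicted: Sequence[int]) -> float:
    """Per-residue FPR = FP / (FP + TN), with 1 = disordered and 0 = ordered."""
    if len(true_labels) != len(predicted):
        raise ValueError("label sequences must have the same length")
    # False positives: ordered residues predicted as disordered.
    fp = sum(1 for t, p in zip(true_labels, predicted) if t == 0 and p == 1)
    # True negatives: ordered residues predicted as ordered.
    tn = sum(1 for t, p in zip(true_labels, predicted) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no ordered residues, FPR is undefined")
    return fp / (fp + tn)


# Hypothetical toy data (not from the paper).
# A protein containing one disordered region, predicted perfectly.
disordered_true = [0, 0, 1, 1, 1, 0]
disordered_pred = [0, 0, 1, 1, 1, 0]

# A fully ordered protein on which the predictor over-calls disorder.
ordered_true = [0, 0, 0, 0]
ordered_pred = [0, 1, 1, 0]

# Test set made only of disordered proteins: the problem is invisible.
fpr_disordered_only = false_positive_rate(disordered_true, disordered_pred)

# Test set that also contains the fully ordered protein: the problem shows up.
fpr_mixed = false_positive_rate(
    disordered_true + ordered_true,
    disordered_pred + ordered_pred,
)

print(f"FPR, disordered-only test set: {fpr_disordered_only:.3f}")
print(f"FPR, mixed test set:           {fpr_mixed:.3f}")
```

In this toy setup the false positives come entirely from the fully ordered protein, so a benchmark that omits such proteins cannot reveal them.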

Figure from the article: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/612c/7986600/993187e0f80c/bbaa018f1.jpg
