RFPR-IDP: reduce the false positive rates for intrinsically disordered protein and region prediction by incorporating both fully ordered proteins and disordered proteins.

Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China.

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China.

Publication information

Brief Bioinform. 2021 Mar 22;22(2):2000-2011. doi: 10.1093/bib/bbaa018.

DOI: 10.1093/bib/bbaa018
PMID: 32112084
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7986600/
Abstract

Intrinsically disordered proteins and regions (IDPs/IDRs) are an important class of proteins involved in many crucial biological functions, and predicting them accurately benefits the prediction of protein structure and function. Most existing methods ignore fully ordered proteins (those without IDRs) during both training and testing. As a result, the corresponding predictors tend to predict fully ordered proteins as disordered. Because these methods were evaluated only on datasets consisting of disordered proteins, with few or no fully ordered proteins, the problem has escaped researchers' attention. However, most newly sequenced proteins are fully ordered, so these predictors fail to distinguish ordered from disordered proteins accurately in real-world applications. We therefore propose RFPR-IDP, a method trained on both fully ordered and disordered proteins and built on a combination of a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. Experimental results show that although existing predictors perform well on disordered proteins, they tend to predict fully ordered proteins as disordered. In contrast, RFPR-IDP correctly predicts fully ordered proteins and outperforms 10 other state-of-the-art methods when evaluated on a test dataset containing both fully ordered and disordered proteins. The web server and datasets of RFPR-IDP are freely available at http://bliulab.net/RFPR-IDP/server.
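The evaluation problem described in the abstract can be illustrated with a small per-residue false positive rate calculation. The sketch below is not the authors' evaluation code. The function names, the 0/1 label encoding (1 = disordered residue, 0 = ordered residue), and the toy predictions are all assumptions made for illustration. It shows how a benchmark containing only disordered proteins can report a low false positive rate while the same predictor mislabels many residues in fully ordered proteins.

```python
from typing import List, Tuple

# A protein is a pair (true_labels, predicted_labels).
# Assumed encoding: 1 = disordered residue, 0 = ordered residue.
Protein = Tuple[List[int], List[int]]


def false_positive_rate(proteins: List[Protein]) -> float:
    """Per-residue FPR = FP / (FP + TN), pooled over all ordered residues."""
    fp = tn = 0
    for truth, pred in proteins:
        for t, p in zip(truth, pred):
            if t == 0:  # only truly ordered residues can be false positives
                if p == 1:
                    fp += 1
                else:
                    tn += 1
    return fp / (fp + tn) if (fp + tn) else 0.0


def is_fully_ordered(protein: Protein) -> bool:
    """A protein is fully ordered when it has no disordered residue."""
    truth, _ = protein
    return not any(truth)


# Toy data, invented for illustration only (not taken from the paper).
disordered_set: List[Protein] = [
    ([0, 0, 1, 1, 1, 0], [0, 0, 1, 1, 1, 0]),
    ([1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0]),
]
fully_ordered_set: List[Protein] = [
    ([0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1]),
    ([0, 0, 0, 0], [0, 1, 1, 0]),
]

fpr_disordered_only = false_positive_rate(disordered_set)
fpr_fully_ordered = false_positive_rate(fully_ordered_set)
fpr_mixed = false_positive_rate(disordered_set + fully_ordered_set)

print(f"FPR, disordered proteins only: {fpr_disordered_only:.3f}")
print(f"FPR, fully ordered proteins:   {fpr_fully_ordered:.3f}")
print(f"FPR, mixed test set:           {fpr_mixed:.3f}")
```

On this toy data the false positive rate is much higher on the fully ordered set than on the disordered-only set, which is the kind of gap a mixed test set is meant to expose.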

