MoRFPred_en: Sequence-based prediction of MoRFs using an ensemble learning strategy.

Authors

Fang Chun, Moriwaki Yoshitaka, Li Caihong, Shimizu Kentaro

Affiliations

Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China.

Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan.

Publication information

J Bioinform Comput Biol. 2019 Dec;17(6):1940015. doi: 10.1142/S0219720019400158.

DOI:10.1142/S0219720019400158

PMID:32019410

Abstract

Molecular recognition features (MoRFs) usually act as "hub" sites in the interaction networks of intrinsically disordered proteins (IDPs). Because an increasing number of serious diseases have been found to be associated with disordered proteins, identifying MoRFs has become increasingly important. In this study, we propose an ensemble learning strategy, named MoRFPred_en, to predict MoRFs from protein sequences. This approach combines four submodels that utilize different sequence-derived features for the prediction, including a multichannel one-dimensional convolutional neural network (CNN_1D multichannel) based model, two deep two-dimensional convolutional neural network (DCNN_2D) based models, and a support vector machine (SVM) based model. When compared with other methods on the same datasets, the MoRFPred_en approach produced better results than existing state-of-the-art MoRF prediction methods, achieving an AUC of 0.762 on the VALIDATION419 dataset, 0.795 on the TEST45 dataset, and 0.776 on the TEST49 dataset. Availability: http://vivace.bi.a.u-tokyo.ac.jp:8008/fang/MoRFPred_en.php.
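The abstract describes combining four submodels and evaluating with AUC, but it does not state how the submodel outputs are merged. The following is a minimal sketch, assuming each submodel emits one score per residue and that the scores are simply averaged; the averaging rule, the function names, and the toy scores are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean


def ensemble_scores(submodel_scores):
    """Average per-residue scores across submodels.

    Plain averaging is an assumed combination rule; the abstract does not
    say how MoRFPred_en merges its four submodels.
    """
    lengths = {len(scores) for scores in submodel_scores}
    if len(lengths) != 1:
        raise ValueError("all submodels must score the same residues")
    return [mean(column) for column in zip(*submodel_scores)]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy per-residue labels (1 = MoRF residue) and hypothetical submodel scores.
labels = [0, 0, 1, 1, 0, 1]
cnn_1d = [0.1, 0.4, 0.8, 0.7, 0.2, 0.6]
dcnn_2d_a = [0.2, 0.3, 0.9, 0.4, 0.1, 0.7]
dcnn_2d_b = [0.3, 0.5, 0.6, 0.8, 0.2, 0.5]
svm = [0.2, 0.6, 0.7, 0.5, 0.3, 0.4]

combined = ensemble_scores([cnn_1d, dcnn_2d_a, dcnn_2d_b, svm])
print("ensemble AUC:", auc(labels, combined))
print("SVM-only AUC:", auc(labels, svm))
```

The pairwise AUC definition used here is quadratic in the number of residues, which is fine for illustration; on full datasets a rank-based implementation would be the more practical choice.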

Similar articles

Identifying short disorder-to-order binding regions in disordered proteins with a deep convolutional neural network method.

J Bioinform Comput Biol. 2019 Feb;17(1):1950004. doi: 10.1142/S0219720019500045.

MoRFPred-plus: Computational Identification of MoRFs in Protein Sequences using Physicochemical Properties and HMM profiles.

J Theor Biol. 2018 Jan 21;437:9-16. doi: 10.1016/j.jtbi.2017.10.015. Epub 2017 Oct 16.

MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins.

Bioinformatics. 2012 Jun 15;28(12):i75-83. doi: 10.1093/bioinformatics/bts209.

Predicting MoRFs in protein sequences using HMM profiles.

BMC Bioinformatics. 2016 Dec 22;17(Suppl 19):504. doi: 10.1186/s12859-016-1375-0.

MFSPSSMpred: identifying short disorder-to-order binding regions in disordered proteins based on contextual local evolutionary conservation.

BMC Bioinformatics. 2013 Oct 4;14:300. doi: 10.1186/1471-2105-14-300.

Computational identification of MoRFs in protein sequences.

Bioinformatics. 2015 Jun 1;31(11):1738-44. doi: 10.1093/bioinformatics/btv060. Epub 2015 Jan 30.

Discovering MoRFs by trisecting intrinsically disordered protein sequence into terminals and middle regions.

BMC Bioinformatics. 2019 Feb 4;19(Suppl 13):378. doi: 10.1186/s12859-018-2396-7.

MoRF_ESM: Prediction of MoRFs in disordered proteins based on a deep transformer protein language model.

J Bioinform Comput Biol. 2024 Apr;22(2):2450006. doi: 10.1142/S0219720024500069. Epub 2024 May 28.

OPAL: prediction of MoRF regions in intrinsically disordered protein sequences.

Bioinformatics. 2018 Jun 1;34(11):1850-1858. doi: 10.1093/bioinformatics/bty032.

Cited by

Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins.

Comput Struct Biotechnol J. 2023 Jun 2;21:3248-3258. doi: 10.1016/j.csbj.2023.06.001. eCollection 2023.

Computational prediction of disordered binding regions.

Comput Struct Biotechnol J. 2023 Feb 10;21:1487-1497. doi: 10.1016/j.csbj.2023.02.018. eCollection 2023.

Deep learning in prediction of intrinsic disorder in proteins.

Comput Struct Biotechnol J. 2022 Mar 8;20:1286-1294. doi: 10.1016/j.csbj.2022.03.003. eCollection 2022.

Intrinsically disordered proteins play diverse roles in cell signaling.

Cell Commun Signal. 2022 Feb 17;20(1):20. doi: 10.1186/s12964-022-00821-7.

On the Verge of Life: Distribution of Nucleotide Sequences in Viral RNAs.

Biosemiotics. 2021;14(2):253-269. doi: 10.1007/s12304-021-09403-5. Epub 2021 Feb 17.
