Fast Prediction of Protein Methylation Sites Using a Sequence-Based Feature Selection Technique.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1264-1273. doi: 10.1109/TCBB.2017.2670558. Epub 2017 Feb 16.

DOI:10.1109/TCBB.2017.2670558
PMID:28222000
Abstract

Protein methylation, an important post-translational modification, plays crucial roles in many cellular processes. The accurate prediction of protein methylation sites is fundamentally important for revealing the molecular mechanisms undergoing methylation. In recent years, computational prediction based on machine learning algorithms has emerged as a powerful and robust approach for identifying methylation sites, and much progress has been made in predictive performance improvement. However, the predictive performance of existing methods is not satisfactory in terms of overall accuracy. Motivated by this, we propose a novel random-forest-based predictor called MePred-RF, integrating several discriminative sequence-based feature descriptors and improving feature representation capability using a powerful feature selection technique. Importantly, unlike other methods based on multiple, complex information inputs, our proposed MePred-RF is based on sequence information alone. Comparative studies on benchmark datasets via vigorous jackknife tests indicate that our proposed MePred-RF method remarkably outperforms other state-of-the-art predictors, leading by a 4.5 percent average in terms of overall accuracy. A user-friendly webserver that implements the proposed method has been established for researchers' convenience, and is now freely available for public use through http://server.malab.cn/MePred-RF. We anticipate our research tool to be useful for the large-scale prediction and analysis of protein methylation sites.
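The evaluation protocol named in the abstract, a jackknife (leave-one-out) test on features derived from sequence alone, can be sketched as follows. This is a minimal illustration under stated assumptions, not the MePred-RF implementation: the amino-acid-composition encoding, the toy peptide windows, and their labels are all made up for illustration, a nearest-centroid rule stands in for the random forest so that the sketch needs only the standard library, and the feature selection step is omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(window):
    """Encode a peptide window as the fraction of each of the 20 residues."""
    counts = Counter(window)
    return [counts[a] / len(window) for a in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def predict(train, x):
    """Nearest-centroid stand-in classifier (assumption: NOT a random forest)."""
    best_label, best_d = None, None
    for label in sorted({y for _, y in train}):
        c = centroid([f for f, y in train if y == label])
        d = sq_dist(c, x)
        if best_d is None or d < best_d:
            best_label, best_d = label, d
    return best_label


def jackknife_accuracy(samples):
    """Leave-one-out test: hold out each sample once and train on the rest."""
    correct = 0
    for i, (x, y) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if predict(train, x) == y:
            correct += 1
    return correct / len(samples)


# Hypothetical windows around a candidate residue; labels are illustrative only.
toy = [
    ("GGRGGRGGR", 1), ("RGGRGGFGG", 1), ("GRGGRGGYG", 1), ("GGFGGRGGR", 1),
    ("LLKELLDEA", 0), ("AEELLKKLL", 0), ("DLLEKALEL", 0), ("KELLAEDLL", 0),
]
samples = [(composition(w), y) for w, y in toy]
accuracy = jackknife_accuracy(samples)
print(f"jackknife accuracy: {accuracy:.2f}")
```

Because every sample is held out exactly once, the jackknife estimate is deterministic for a given dataset and classifier, which is one reason it is often used to compare predictors on a shared benchmark.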

Similar articles

1
Fast Prediction of Protein Methylation Sites Using a Sequence-Based Feature Selection Technique.
IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1264-1273. doi: 10.1109/TCBB.2017.2670558. Epub 2017 Feb 16.
2
CPPred-RF: A Sequence-based Predictor for Identifying Cell-Penetrating Peptides and Their Uptake Efficiency.
J Proteome Res. 2017 May 5;16(5):2044-2053. doi: 10.1021/acs.jproteome.7b00019. Epub 2017 Apr 26.
3
PhosPred-RF: A Novel Sequence-Based Predictor for Phosphorylation Sites Using Sequential Information Only.
IEEE Trans Nanobioscience. 2017 Jun;16(4):240-247. doi: 10.1109/TNB.2017.2661756. Epub 2017 Jan 31.
4
Exploring sequence-based features for the improved prediction of DNA N4-methylcytosine sites in multiple species.
Bioinformatics. 2019 Apr 15;35(8):1326-1333. doi: 10.1093/bioinformatics/bty824.
5
predCar-site: Carbonylation sites prediction in proteins using support vector machine with resolving data imbalanced issue.
Anal Biochem. 2017 May 15;525:107-113. doi: 10.1016/j.ab.2017.03.008. Epub 2017 Mar 9.
6
Enhanced Protein Fold Prediction Method Through a Novel Feature Extraction Technique.
IEEE Trans Nanobioscience. 2015 Sep;14(6):649-59. doi: 10.1109/TNB.2015.2450233.
7
DeepSSPred: A Deep Learning Based Sulfenylation Site Predictor Via a Novel nSegmented Optimize Federated Feature Encoder.
Protein Pept Lett. 2021;28(6):708-721. doi: 10.2174/0929866527666201202103411.
8
Meta-iPVP: a sequence-based meta-predictor for improving the prediction of phage virion proteins using effective feature representation.
J Comput Aided Mol Des. 2020 Oct;34(10):1105-1116. doi: 10.1007/s10822-020-00323-z. Epub 2020 Jun 16.
9
A novel sequence-based method for phosphorylation site prediction with feature selection and analysis.
Protein Pept Lett. 2012 Jan;19(1):70-8. doi: 10.2174/092986612798472893.
10
TargetDBP: Accurate DNA-Binding Protein Prediction Via Sequence-Based Multi-View Feature Learning.
IEEE/ACM Trans Comput Biol Bioinform. 2020 Jul-Aug;17(4):1419-1429. doi: 10.1109/TCBB.2019.2893634. Epub 2019 Jan 18.

Cited by

1
T4Seeker: a hybrid model for type IV secretion effectors identification.
BMC Biol. 2024 Nov 14;22(1):259. doi: 10.1186/s12915-024-02064-z.
2
MSlocPRED: deep transfer learning-based identification of multi-label mRNA subcellular localization.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae504.
3
Current Development of Data Resources and Bioinformatics Tools for Anticoronavirus Peptide.
Curr Med Chem. 2024;31(26):4079-4099. doi: 10.2174/0109298673264218231121104407.
4
Analysis and review of techniques and tools based on machine learning and deep learning for prediction of lysine malonylation sites in protein sequences.
Database (Oxford). 2024 Jan 19;2024. doi: 10.1093/database/baad094.
5
MIND-S is a deep-learning prediction model for elucidating protein post-translational modifications in human diseases.
Cell Rep Methods. 2023 Mar 27;3(3):100430. doi: 10.1016/j.crmeth.2023.100430.
6
Arginine Methylation of the PGC-1α C-Terminus Is Temperature-Dependent.
Biochemistry. 2023 Jan 3;62(1):22-34. doi: 10.1021/acs.biochem.2c00363. Epub 2022 Dec 19.
7
CNNArginineMe: A CNN structure for training models for predicting arginine methylation sites based on the One-Hot encoding of peptide sequence.
Front Genet. 2022 Oct 17;13:1036862. doi: 10.3389/fgene.2022.1036862. eCollection 2022.
8
Thirty years of molecular dynamics simulations on posttranslational modifications of proteins.
Phys Chem Chem Phys. 2022 Nov 9;24(43):26371-26397. doi: 10.1039/d2cp02883b.
9
A SNARE Protein Identification Method Based on iLearnPlus to Efficiently Solve the Data Imbalance Problem.
Front Genet. 2022 Jan 28;12:818841. doi: 10.3389/fgene.2021.818841. eCollection 2021.
10
NetNMSP: Nonoverlapping maximal sequential pattern mining.
Appl Intell (Dordr). 2022;52(9):9861-9884. doi: 10.1007/s10489-021-02912-3. Epub 2022 Jan 10.