XGBFEMF: An XGBoost-Based Framework for Essential Protein Prediction.

Publication information

IEEE Trans Nanobioscience. 2018 Jul;17(3):243-250. doi: 10.1109/TNB.2018.2842219. Epub 2018 May 31.

DOI: 10.1109/TNB.2018.2842219
PMID: 29993553
Abstract

Essential proteins are vital to sustaining cellular life and play an important role in biological research and drug design. As large amounts of biological data related to essential proteins have become available, an increasing number of computational methods have been proposed. Unlike methods that adopt a single machine learning model or an ensemble machine learning model, this paper proposes a prediction framework named XGBFEMF for identifying essential proteins. The framework includes a SUB-EXPAND-SHRINK method, which constructs composite features from the original features and selects a better feature subset for essential protein prediction, and a model fusion method, which yields a more effective prediction model. We carry out experiments on yeast data to assess the performance of XGBFEMF with ROC analysis, accuracy analysis, and top analysis, and we run experiments on E. coli data to validate that performance. The test results show that the XGBFEMF framework can effectively improve many key evaluation indicators. In addition, we analyze each step of the framework; the results show that every step of the SUB-EXPAND-SHRINK method, as well as the multi-model fusion step, improves prediction performance.
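
The abstract names a model fusion step and several evaluation views (ROC analysis, top analysis) without giving formulas. The following is a minimal Python sketch of how such an evaluation might look; it is not the authors' implementation. It assumes that fusion is an unweighted average of per-model scores, that "top analysis" counts true essential proteins among the k highest-ranked candidates, and that labels are 1 for essential and 0 otherwise. The function names and the toy scores are hypothetical.

```python
from typing import List, Sequence


def fuse_scores(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Assumed fusion rule: unweighted mean of each protein's score across models."""
    n_models = len(model_scores)
    return [sum(column) / n_models for column in zip(*model_scores)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher.
    Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k_hits(labels: Sequence[int], scores: Sequence[float], k: int) -> int:
    """Assumed meaning of 'top analysis': number of true essential
    proteins among the k candidates with the highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in order[:k])


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 0, 1, 0, 0, 1]
    model_a = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3]
    model_b = [0.7, 0.4, 0.8, 0.2, 0.3, 0.9]
    fused = fuse_scores([model_a, model_b])
    print("AUC (model_a):", roc_auc(labels, model_a))
    print("AUC (fused):  ", roc_auc(labels, fused))
    print("Top-3 hits (fused):", top_k_hits(labels, fused, 3))
```

The pairwise AUC is quadratic in the number of proteins, which is acceptable for a sketch; a rank-based computation or a library routine would be the usual choice on full datasets.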

Similar articles

1
XGBFEMF: An XGBoost-Based Framework for Essential Protein Prediction.
IEEE Trans Nanobioscience. 2018 Jul;17(3):243-250. doi: 10.1109/TNB.2018.2842219. Epub 2018 May 31.
2
Automated feature engineering improves prediction of protein-protein interactions.
Amino Acids. 2019 Aug;51(8):1187-1200. doi: 10.1007/s00726-019-02756-9. Epub 2019 Jul 5.
3
A framework for incorporating functional interrelationships into protein function prediction algorithms.
IEEE/ACM Trans Comput Biol Bioinform. 2012 May-Jun;9(3):740-53. doi: 10.1109/TCBB.2011.148.
4
Machine Learning Approaches for Protein-Protein Interaction Hot Spot Prediction: Progress and Comparative Assessment.
Molecules. 2018 Oct 4;23(10):2535. doi: 10.3390/molecules23102535.
5
Minimalist ensemble algorithms for genome-wide protein localization prediction.
BMC Bioinformatics. 2012 Jul 3;13:157. doi: 10.1186/1471-2105-13-157.
6
RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences.
Int J Mol Sci. 2016 May 18;17(5):757. doi: 10.3390/ijms17050757.
7
Prediction of protein-protein interaction sites using an ensemble method.
BMC Bioinformatics. 2009 Dec 16;10:426. doi: 10.1186/1471-2105-10-426.
8
Accurate prediction of protein-protein interactions by integrating potential evolutionary information embedded in PSSM profile and discriminative vector machine classifier.
Oncotarget. 2017 Apr 4;8(14):23638-23649. doi: 10.18632/oncotarget.15564.
9
PreSPI: a domain combination based prediction system for protein-protein interaction.
Nucleic Acids Res. 2004 Dec 1;32(21):6312-20. doi: 10.1093/nar/gkh972. Print 2004.
10
PCVMZM: Using the Probabilistic Classification Vector Machines Model Combined with a Zernike Moments Descriptor to Predict Protein-Protein Interactions from Protein Sequences.
Int J Mol Sci. 2017 May 11;18(5):1029. doi: 10.3390/ijms18051029.

Cited by

1
Prediction and optimization of stretch flangeability of advanced high strength steels utilizing machine learning approaches.
Sci Rep. 2025 May 10;15(1):16296. doi: 10.1038/s41598-025-00786-w.
2
An automated classification pipeline for tables in pharmacokinetic literature.
Sci Rep. 2025 Mar 24;15(1):10071. doi: 10.1038/s41598-025-94778-5.
3
EPI-SF: essential protein identification in protein interaction networks using sequence features.
PeerJ. 2024 Mar 13;12:e17010. doi: 10.7717/peerj.17010. eCollection 2024.
4
A predictive model for postoperative adverse outcomes following surgical treatment of acute type A aortic dissection based on machine learning.
J Clin Hypertens (Greenwich). 2024 Mar;26(3):251-261. doi: 10.1111/jch.14774. Epub 2024 Feb 11.
5
ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization.
BMC Genomics. 2024 Jan 26;25(1):117. doi: 10.1186/s12864-024-10019-5.
6
iMRSAPred: Improved Prediction of Anti-MRSA Peptides Using Physicochemical and Pairwise Contact-Energy Properties of Amino Acids.
ACS Omega. 2024 Jan 3;9(2):2874-2883. doi: 10.1021/acsomega.3c08303. eCollection 2024 Jan 16.
7
Integrating somatic mutation profiles with structural deep clustering network for metabolic stratification in pancreatic cancer: a comprehensive analysis of prognostic and genomic landscapes.
Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad430.
8
Method for Classifying Schizophrenia Patients Based on Machine Learning.
J Clin Med. 2023 Jun 29;12(13):4375. doi: 10.3390/jcm12134375.
9
Machine learning prognosis model based on patient-reported outcomes for chronic heart failure patients after discharge.
Health Qual Life Outcomes. 2023 Mar 29;21(1):31. doi: 10.1186/s12955-023-02109-x.
10
PreAcrs: a machine learning framework for identifying anti-CRISPR proteins.
BMC Bioinformatics. 2022 Oct 25;23(1):444. doi: 10.1186/s12859-022-04986-3.