An Ensemble Classifiers for Improved Prediction of Native-Non-Native Protein-Protein Interaction.

Affiliations

Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea.

Department of Electrical Engineering, Telkom University, Bandung 40257, West Java, Indonesia.

Publication information

Int J Mol Sci. 2024 May 29;25(11):5957. doi: 10.3390/ijms25115957.

DOI:10.3390/ijms25115957
PMID:38892144
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11172808/
Abstract

In this study, we present an innovative approach to improve the prediction of protein-protein interactions (PPIs) through the utilization of an ensemble classifier, specifically focusing on distinguishing between native and non-native interactions. Leveraging the strengths of various base models, including random forest, gradient boosting, extreme gradient boosting, and light gradient boosting, our ensemble classifier integrates these diverse predictions using a logistic regression meta-classifier. Our model was evaluated using a comprehensive dataset generated from molecular dynamics simulations. While the gains in AUC and other metrics might seem modest, they contribute to a model that is more robust, consistent, and adaptable. To assess the effectiveness of various approaches, we compared the performance of logistic regression to four baseline models. Our results indicate that logistic regression consistently underperforms across all evaluated metrics. This suggests that it may not be well-suited to capture the complex relationships within this dataset. Tree-based models, on the other hand, appear to be more effective for problems involving molecular dynamics simulations. Extreme gradient boosting (XGBoost) and light gradient boosting (LightGBM) are optimized for performance and speed, handling datasets effectively and incorporating regularizations to avoid over-fitting. Our findings indicate that the ensemble method enhances the predictive capability of PPIs, offering a promising tool for computational biology and drug discovery by accurately identifying potential interaction sites and facilitating the understanding of complex protein functions within biological systems.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e65a/11172808/2b26af4353b2/ijms-25-05957-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e65a/11172808/38929636bb8e/ijms-25-05957-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e65a/11172808/77639f6d1a10/ijms-25-05957-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e65a/11172808/bb786e6bb54c/ijms-25-05957-g004.jpg

Similar articles

1
An Ensemble Classifiers for Improved Prediction of Native-Non-Native Protein-Protein Interaction.
Int J Mol Sci. 2024 May 29;25(11):5957. doi: 10.3390/ijms25115957.
2
Machine learning algorithms for outcome prediction in (chemo)radiotherapy: An empirical comparison of classifiers.
Med Phys. 2018 Jul;45(7):3449-3459. doi: 10.1002/mp.12967. Epub 2018 Jun 13.
3
Minimalist ensemble algorithms for genome-wide protein localization prediction.
BMC Bioinformatics. 2012 Jul 3;13:157. doi: 10.1186/1471-2105-13-157.
4
DeepStack-DTIs: Predicting Drug-Target Interactions Using LightGBM Feature Selection and Deep-Stacked Ensemble Classifier.
Interdiscip Sci. 2022 Jun;14(2):311-330. doi: 10.1007/s12539-021-00488-7. Epub 2021 Nov 3.
5
PeNGaRoo, a combined gradient boosting and ensemble learning framework for predicting non-classical secreted proteins.
Bioinformatics. 2020 Feb 1;36(3):704-712. doi: 10.1093/bioinformatics/btz629.
6
Using a stacked ensemble learning framework to predict modulators of protein-protein interactions.
Comput Biol Med. 2023 Jul;161:107032. doi: 10.1016/j.compbiomed.2023.107032. Epub 2023 May 16.
7
Predicting protein-protein interactions between human and hepatitis C virus via an ensemble learning method.
Mol Biosyst. 2014 Dec;10(12):3147-54. doi: 10.1039/c4mb00410h. Epub 2014 Sep 18.
8
Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier.
Comput Biol Med. 2020 Aug;123:103899. doi: 10.1016/j.compbiomed.2020.103899. Epub 2020 Jul 15.
9
An intelligent model for prediction of abiotic stress-responsive microRNAs in plants using statistical moments based features and ensemble approaches.
Methods. 2024 Aug;228:65-79. doi: 10.1016/j.ymeth.2024.05.008. Epub 2024 May 18.
10
Artificial intelligence applications in allergic rhinitis diagnosis: Focus on ensemble learning.
Asia Pac Allergy. 2024 Jun;14(2):56-62. doi: 10.5415/apallergy.0000000000000126. Epub 2023 Dec 18.

Cited by

1
Recent advances in deep learning for protein-protein interaction: a review.
BioData Min. 2025 Jun 16;18(1):43. doi: 10.1186/s13040-025-00457-6.
