ToxSTK: A multi-target toxicity assessment utilizing molecular structure and stacking ensemble learning.

Author information

Boonsom Surapong, Chamnansil Panisara, Boonseng Sarote, Srisongkram Tarapong

Affiliations

Department of Chemistry, Mahidol Wittayanusorn School, Phutthamonthon, Nakhon Pathom, Thailand.

Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Thailand.

Publication information

Comput Biol Med. 2025 Feb;185:109480. doi: 10.1016/j.compbiomed.2024.109480. Epub 2024 Dec 6.

DOI: 10.1016/j.compbiomed.2024.109480
PMID: 39644580

Abstract

Drug registration requires a risk assessment of new active pharmaceutical ingredients and excipients to ensure that they are safe for human health and the environment. However, traditional risk assessment is expensive and relies heavily on animal testing. Machine learning (ML) has been used as a risk assessment tool that requires less time, less money, and fewer animals than in vivo experiments. Even so, ML-based assessments often rely on a single model, which may introduce bias and yield unreliable predictions. Stacking ensemble learning is an ML framework that makes its final prediction from the outputs of multiple models, and it performs well in quantitative structure-activity relationship (QSAR) studies. In this study, we developed ToxSTK, a multi-target toxicity assessment tool based on stacking ensemble learning. We aimed to create an ML tool that makes toxicity assessment more affordable and less reliant on animal models. We focused on four key targets commonly assessed in early-stage drug development: hERG toxicity, mTOR toxicity, peripheral blood mononuclear cell (PBMC) toxicity, and mutagenicity. Our model combined 12 molecular fingerprints with 3 ML algorithms, generating 36 novel predictive features (PFs). These PFs were then combined to construct the final meta-decision model. Our results demonstrated that ToxSTK performs strongly on standard regression and classification metrics, indicating that it is reliable and accurate in predicting chemical toxicities within its applicability domain. The model passed the y-randomization test, confirming that the identified QSAR is robust and not due to chance. In addition, the model outperforms existing ML methods for these endpoints, suggesting that it is effective for risk assessment applications. We recommend incorporating this stacking ensemble learning framework into the chemical risk assessment pipeline to improve model generalization, accuracy, robustness, and reliability.
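The stacking scheme in the abstract (12 fingerprints × 3 algorithms = 36 predictive features feeding a meta-model) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the fingerprints or the three algorithms, so the fingerprints are random bit matrices, the base learners are simple stand-ins (ridge regression, k-nearest neighbors, Bernoulli naive Bayes), the meta-model is a plain logistic regression, and only a binary classification endpoint is shown. It is not the authors' implementation.

```python
import numpy as np

N_MOL, N_FP, N_BITS, N_FOLDS = 120, 12, 32, 5
rng = np.random.default_rng(0)

# Hypothetical data: 12 random bit matrices stand in for 12 fingerprint types.
fingerprints = [rng.integers(0, 2, size=(N_MOL, N_BITS)).astype(float)
                for _ in range(N_FP)]
# Toy binary "toxic" label driven by a few bits of the first two fingerprints.
signal = fingerprints[0][:, :4].sum(axis=1) + fingerprints[1][:, :4].sum(axis=1)
y = (signal >= 4).astype(float)

def ridge_scores(X_tr, y_tr, X_te, lam=1.0):
    """Closed-form ridge regression on the bits, clipped to [0, 1]."""
    A = np.hstack([X_tr, np.ones((len(X_tr), 1))])
    w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y_tr)
    B = np.hstack([X_te, np.ones((len(X_te), 1))])
    return np.clip(B @ w, 0.0, 1.0)

def knn_scores(X_tr, y_tr, X_te, k=5):
    """Mean label of the k nearest training molecules (Hamming distance)."""
    dist = (X_te[:, None, :] != X_tr[None, :, :]).sum(axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_tr[nearest].mean(axis=1)

def naive_bayes_scores(X_tr, y_tr, X_te):
    """Bernoulli naive Bayes with Laplace smoothing; returns P(y=1 | x)."""
    log_post = []
    for c in (0.0, 1.0):
        Xc = X_tr[y_tr == c]
        p = (Xc.sum(axis=0) + 1.0) / (len(Xc) + 2.0)
        prior = (len(Xc) + 1.0) / (len(X_tr) + 2.0)
        log_post.append(np.log(prior) + X_te @ np.log(p)
                        + (1.0 - X_te) @ np.log(1.0 - p))
    return 1.0 / (1.0 + np.exp(log_post[0] - log_post[1]))

BASE_LEARNERS = [ridge_scores, knn_scores, naive_bayes_scores]

def out_of_fold_features(fps, y, n_folds=N_FOLDS, seed=1):
    """One column per (fingerprint, learner) pair, filled with out-of-fold scores."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    pf = np.zeros((n, len(fps) * len(BASE_LEARNERS)))
    col = 0
    for X in fps:
        for learner in BASE_LEARNERS:
            for test_idx in folds:
                train_idx = np.setdiff1d(np.arange(n), test_idx)
                pf[test_idx, col] = learner(X[train_idx], y[train_idx], X[test_idx])
            col += 1
    return pf

def fit_logistic(Z, y, lr=0.5, steps=500):
    """Meta-model: logistic regression fitted by batch gradient descent."""
    A = np.hstack([Z, np.ones((len(Z), 1))])
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(A @ w)))
        w -= lr * A.T @ (p - y) / len(y)
    return w

def predict_logistic(w, Z):
    A = np.hstack([Z, np.ones((len(Z), 1))])
    return 1.0 / (1.0 + np.exp(-(A @ w)))

pf = out_of_fold_features(fingerprints, y)   # the 36 predictive features
w_meta = fit_logistic(pf, y)                 # meta-decision model
meta_prob = predict_logistic(w_meta, pf)     # training-set probabilities only
print(pf.shape)  # (120, 36)
```

Each predictive feature is computed out of fold, so the meta-model never sees a base-learner score produced from a molecule's own label. The final probabilities above are training-set outputs; judging the model would require a separate held-out set, which this sketch omits.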

Similar articles

1. ToxSTK: A multi-target toxicity assessment utilizing molecular structure and stacking ensemble learning.
   Comput Biol Med. 2025 Feb;185:109480. doi: 10.1016/j.compbiomed.2024.109480. Epub 2024 Dec 6.
2. Predicting the reproductive toxicity of chemicals using ensemble learning methods and molecular fingerprints.
   Toxicol Lett. 2021 Apr 1;340:4-14. doi: 10.1016/j.toxlet.2021.01.002. Epub 2021 Jan 6.
3. Interpretable lung cancer risk prediction using ensemble learning and XAI based on lifestyle and demographic data.
   Comput Biol Chem. 2025 Aug;117:108438. doi: 10.1016/j.compbiolchem.2025.108438. Epub 2025 Mar 27.
4. Toxicity prediction of small drug molecules of androgen receptor using multilevel ensemble model.
   J Bioinform Comput Biol. 2019 Oct;17(5):1950033. doi: 10.1142/S0219720019500331. Epub 2019 Oct 13.
5. A new hybrid ensemble machine-learning model for severity risk assessment and post-COVID prediction system.
   Math Biosci Eng. 2022 Apr 13;19(6):6102-6123. doi: 10.3934/mbe.2022285.
6. Ensemble learning with explainable AI for improved heart disease prediction based on multiple datasets.
   Sci Rep. 2025 Apr 22;15(1):13912. doi: 10.1038/s41598-025-97547-6.
7. General Approach to Estimate Error Bars for Quantitative Structure-Activity Relationship Predictions of Molecular Activity.
   J Chem Inf Model. 2018 Aug 27;58(8):1561-1575. doi: 10.1021/acs.jcim.8b00114. Epub 2018 Jul 17.
8. Exploring Ensemble Learning Techniques for Infant Mortality Prediction: A Technical Analysis of XGBoost Stacking AdaBoost and Bagging Models.
   Birth Defects Res. 2025 Feb;117(2):e2443. doi: 10.1002/bdr2.2443.
9. Heterogeneous ensemble learning for enhanced crash forecasts - A frequentist and machine learning based stacking framework.
   J Safety Res. 2023 Feb;84:418-434. doi: 10.1016/j.jsr.2022.12.005. Epub 2022 Dec 14.
10. ARKA: a framework of dimensionality reduction for machine-learning classification modeling, risk assessment, and data gap-filling of sparse environmental toxicity data.
    Environ Sci Process Impacts. 2024 Jun 19;26(6):991-1007. doi: 10.1039/d4em00173g.

Cited by

1. Multimodal Deep Learning for Generating Potential Anti-Dengue Peptides.
   ACS Omega. 2025 Aug 19;10(34):38653-38674. doi: 10.1021/acsomega.5c03510. eCollection 2025 Sep 2.
2. StackNAFLD: An Accurate Stacking Ensemble Learning Targeting NAFLD Treatment.
   ACS Omega. 2025 Aug 15;10(33):37096-37114. doi: 10.1021/acsomega.5c01473. eCollection 2025 Aug 26.
3. Mixture of experts for multitask learning in cardiotoxicity assessment.
   J Cheminform. 2025 Aug 29;17(1):135. doi: 10.1186/s13321-025-01072-7.
4. SbD4Skin by EosCloud: Integrating multi-view molecular representation for predicting skin sensitization, irritation, and acute dermal toxicity.
   Comput Struct Biotechnol J. 2025 Aug 6;29:222-235. doi: 10.1016/j.csbj.2025.08.001. eCollection 2025.
5. Stacking Ensemble Neural Network for Chemical Safety Assessment: A Case Study of Thyroid Peroxidase and Natural Product Screening.
   ACS Omega. 2025 Jul 10;10(28):30450-30466. doi: 10.1021/acsomega.5c02188. eCollection 2025 Jul 22.
6. Using the Coefficient of Conformism of a Correlative Prediction in Simulation of Cardiotoxicity.
   Toxics. 2025 Apr 16;13(4):309. doi: 10.3390/toxics13040309.
7. Protecting your skin: a highly accurate LSTM network integrating conjoint features for predicting chemical-induced skin irritation.
   J Cheminform. 2025 Mar 27;17(1):39. doi: 10.1186/s13321-025-00980-y.
8. Bidirectional Long Short-Term Memory (BiLSTM) Neural Networks with Conjoint Fingerprints: Application in Predicting Skin-Sensitizing Agents in Natural Compounds.
   J Chem Inf Model. 2025 Mar 24;65(6):3035-3047. doi: 10.1021/acs.jcim.5c00032. Epub 2025 Mar 3.