

ToxSTK: A multi-target toxicity assessment utilizing molecular structure and stacking ensemble learning.

Authors

Boonsom Surapong, Chamnansil Panisara, Boonseng Sarote, Srisongkram Tarapong

Affiliations

Department of Chemistry, Mahidol Wittayanusorn School, Phutthamonthon, Nakhon Pathom, Thailand.

Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Thailand.

Publication information

Comput Biol Med. 2025 Feb;185:109480. doi: 10.1016/j.compbiomed.2024.109480. Epub 2024 Dec 6.

Abstract

Drug registration requires risk assessment of new active pharmaceutical ingredients and excipients to ensure they are safe for human health and the environment. However, traditional risk assessment is expensive and relies heavily on animal testing. Machine learning (ML) has been used as a risk assessment tool that requires less time, less money, and fewer animals than in vivo experiments. Even so, ML approaches often rely on a single model, which may introduce bias and yield unreliable predictions. Stacking ensemble learning is an ML framework that makes predictions by combining the outputs of multiple models. This framework performs well in quantitative structure-activity relationship (QSAR) studies. In this study, we developed ToxSTK, a multi-target toxicity assessment method based on stacking ensemble learning. We aimed to create an ML tool that makes toxicity assessment more affordable and less reliant on animal models. We focused on four key endpoints commonly assessed in early-stage drug development: hERG toxicity, mTOR toxicity, peripheral blood mononuclear cell (PBMC) toxicity, and mutagenicity. Our model combined 12 molecular fingerprints with 3 ML algorithms, generating 36 novel predictive features (PFs). These PFs were then combined to construct the final meta-decision model. Our results showed that the ToxSTK model performed strongly on standard regression and classification metrics, indicating that it predicts chemical toxicities reliably and accurately within its applicability domain. The model passed the y-randomization test, confirming that the identified QSAR is robust and not due to random chance. Additionally, the model outperformed existing ML methods for these endpoints, suggesting that it is effective for risk assessment applications. We recommend incorporating this stacking ensemble learning framework into the chemical risk assessment pipeline to improve model generalization, accuracy, robustness, and reliability.
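The step from 12 fingerprints and 3 algorithms to 36 predictive features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the fingerprints nor the algorithms, so the fingerprint names, the three k-nearest-neighbour variants standing in for the three algorithms, the synthetic bit-set data, and the averaging meta-decision are all assumptions made for illustration. What it does show is the general stacking pattern, in which each (fingerprint, algorithm) pair contributes one column of out-of-fold predictions and the resulting matrix feeds the meta-model.

```python
import itertools
import random

random.seed(0)

# Hypothetical placeholders: the abstract names neither the fingerprints nor the algorithms.
FINGERPRINTS = [f"fingerprint_{i}" for i in range(1, 13)]  # 12 encodings
NEIGHBOR_COUNTS = {"knn_1": 1, "knn_3": 3, "knn_5": 5}     # 3 stand-in algorithms


def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(train_x, train_y, query, k):
    """Mean label of the k training samples most similar to the query."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: tanimoto(pair[0], query),
        reverse=True,
    )
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def out_of_fold_predictions(x, y, k, n_folds=5):
    """Predict every sample from training data that excludes its own fold."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        for i in held_out:
            preds[i] = knn_predict(train_x, train_y, x[i], k)
    return preds


# Synthetic data (assumption): 60 "molecules", each encoded 12 ways as a set of on bits.
n_samples = 60
labels = [random.randint(0, 1) for _ in range(n_samples)]
encodings = {}
for fp in FINGERPRINTS:
    encodings[fp] = [
        # Label-1 samples draw bits from 0..39, label-0 samples from 20..59,
        # so the classes overlap but remain learnable.
        set(random.sample(range(0, 40) if label else range(20, 60), 10))
        for label in labels
    ]

# One predictive feature (PF) column per (fingerprint, algorithm) pair: 12 x 3 = 36.
pf_columns = {}
for fp, (algo, k) in itertools.product(FINGERPRINTS, NEIGHBOR_COUNTS.items()):
    pf_columns[(fp, algo)] = out_of_fold_predictions(encodings[fp], labels, k)

# Rows are molecules, columns are PFs; this matrix is the meta-model's input.
pf_matrix = [[pf_columns[key][i] for key in pf_columns] for i in range(n_samples)]

# Simplest possible meta-decision (an assumption): average the PFs, threshold at 0.5.
meta_pred = [1 if sum(row) / len(row) >= 0.5 else 0 for row in pf_matrix]
accuracy = sum(p == t for p, t in zip(meta_pred, labels)) / n_samples
print(len(pf_matrix), len(pf_matrix[0]), accuracy)
```

Using out-of-fold predictions keeps the meta-model from seeing base-model outputs that were fitted on the same samples, which would otherwise inflate its apparent performance. In practice the averaging step would be replaced by a trained meta-learner, and a y-randomization check would repeat the whole procedure with shuffled labels to confirm that performance collapses toward chance.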

