QSAR Classification Modeling Using Machine Learning with a Consensus-Based Approach for Multivariate Chemical Hazard End Points.

Author information

Fuadah Yunendah Nur, Pramudito Muhammad Adnan, Firdaus Lulu, Vanheusden Frederique J, Lim Ki Moo

Affiliations

Computational Medicine Lab, Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Republic of Korea.

School of Electrical Engineering, Telkom University, Bandung 40257, Indonesia.

Publication information

ACS Omega. 2024 Dec 12;9(51):50796-50808. doi: 10.1021/acsomega.4c09356. eCollection 2024 Dec 24.

DOI:10.1021/acsomega.4c09356

PMID:39741811

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11683616/

Abstract

This study introduces a computational approach that uses hybrid machine learning models to predict toxicity across eight critical end points: cardiac toxicity, inhalation toxicity, dermal toxicity, oral toxicity, skin irritation, skin sensitization, eye irritation, and respiratory irritation. Using cheminformatics tools, we extracted features from curated data sets, incorporating a range of descriptors: Morgan circular fingerprints, MACCS keys, Mordred-calculated descriptors, and physicochemical properties. The consensus model was developed by selecting the best-performing classifier, chosen from Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Support Vector Machines (SVM), for each descriptor, to optimize predictive accuracy and robustness across the end points. The model achieved strong predictive performance, with area under the curve (AUC) scores ranging from 0.78 to 0.90. This framework offers a reliable, ethical, and effective in silico approach to chemical safety assessment, underscoring the potential of computational methods to support both regulatory and research applications in toxicity prediction.
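The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the selected classifiers are combined, so averaging their predicted probabilities is an assumption here, as is selecting by validation AUC. The function names (`auc_score`, `select_best_per_descriptor`, `consensus_probability`) are hypothetical, and all probabilities are made-up toy numbers rather than results from the paper.

```python
from statistics import mean


def auc_score(labels, scores):
    """Rank-based ROC AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best_per_descriptor(val_probs, y_val):
    """For each descriptor, keep the classifier with the highest validation AUC.

    val_probs maps descriptor -> classifier -> list of validation probabilities.
    Returns descriptor -> (best classifier name, its AUC).
    """
    best = {}
    for descriptor, by_classifier in val_probs.items():
        scored = {clf: auc_score(y_val, p) for clf, p in by_classifier.items()}
        name = max(scored, key=scored.get)
        best[descriptor] = (name, scored[name])
    return best


def consensus_probability(test_probs, best):
    """Assumed consensus rule: average the selected classifiers' test probabilities."""
    chosen = [test_probs[descriptor][clf] for descriptor, (clf, _) in best.items()]
    return [mean(column) for column in zip(*chosen)]


# Toy validation labels and probabilities (illustrative only, not from the paper).
y_val = [1, 0, 1, 0]
val_probs = {
    "morgan": {
        "RF": [0.9, 0.2, 0.7, 0.4],
        "XGBoost": [0.6, 0.5, 0.4, 0.7],
        "SVM": [0.8, 0.6, 0.5, 0.3],
    },
    "maccs": {
        "RF": [0.5, 0.6, 0.7, 0.4],
        "XGBoost": [0.8, 0.1, 0.9, 0.3],
        "SVM": [0.4, 0.5, 0.6, 0.7],
    },
}
# Toy test-set probabilities for two unseen compounds.
test_probs = {
    "morgan": {"RF": [0.8, 0.2], "XGBoost": [0.5, 0.5], "SVM": [0.6, 0.4]},
    "maccs": {"RF": [0.4, 0.6], "XGBoost": [0.6, 0.4], "SVM": [0.5, 0.5]},
}

best = select_best_per_descriptor(val_probs, y_val)
consensus = consensus_probability(test_probs, best)
print(best)       # selected classifier and validation AUC per descriptor
print(consensus)  # averaged probability per test compound
```

With these toy inputs, RF is selected for the Morgan descriptor and XGBoost for MACCS. In a real workflow the probabilities would come from classifiers trained on descriptors computed from the curated data sets, and a majority vote or weighted average could replace the plain mean.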

Figure (image from the PMC full text): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0127/11683616/b4f8b8095f99/ao4c09356_0001.jpg
