Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs.

Author information

Banerjee Arkaprava, Roy Kunal

Affiliation

Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.

Publication information

Sci Rep. 2025 Jan 4;15(1):808. doi: 10.1038/s41598-024-85063-y.

DOI: 10.1038/s41598-024-85063-y

PMID: 39755865

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11700179/

Abstract

We have adopted the classification Read-Across Structure-Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to simply as "descriptors" in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as "fingerprints" in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models. All 36 models were cross-validated 20 times with a fivefold cross-validation strategy, and their predictivity was checked on the test set data. A multi-criteria decision-making strategy, the Sum of Ranking Differences (SRD) approach, was adopted to identify the best-performing model based on robustness and external validation parameters. This statistical analysis suggested that the c-RASAR models had an overall good performance, while the best-performing model was also a c-RASAR model (LDA c-RASAR model derived from topological descriptors, with MCC values of 0.229 and 0.431 for the training and test sets, respectively). This model was used to screen a true external data set prepared from the known nephrotoxic compounds of DrugBankDB, demonstrating good predictivity.
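
The two quantities the abstract relies on for comparing models, the Matthews correlation coefficient (MCC) and the Sum of Ranking Differences (SRD), can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the metric values in `example_table`, and the use of the row-wise maximum as the SRD reference are assumptions made for illustration, and the abstract does not state which reference or data scaling the study used.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


def rank_ascending(values):
    """1-based ranks (smallest value gets rank 1); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


def sum_of_ranking_differences(table):
    """SRD per model.

    table maps a model name to its list of validation metrics, with the
    metrics in the same order for every model and higher meaning better.
    ASSUMPTION: the reference is the row-wise maximum over all models.
    """
    columns = list(table.values())
    reference = [max(row) for row in zip(*columns)]
    reference_ranks = rank_ascending(reference)
    return {
        name: sum(abs(r - ref) for r, ref in zip(rank_ascending(values), reference_ranks))
        for name, values in table.items()
    }


# Hypothetical metric values for three hypothetical models (not from the paper).
example_table = {
    "model_A": [0.70, 0.50, 0.30],
    "model_B": [0.60, 0.55, 0.20],
    "model_C": [0.65, 0.40, 0.45],
}
example_srd = sum_of_ranking_differences(example_table)
```

In this sketch a lower SRD means that a model orders the metrics more like the reference does, so candidates are preferred in increasing order of SRD. Published SRD analyses usually also validate the ranking against random orderings, which this sketch omits.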
