The application of chemical similarity measures in an unconventional modeling framework c-RASAR along with dimensionality reduction techniques to a representative hepatotoxicity dataset.

Affiliation

Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.

Publication information

Sci Rep. 2024 Sep 6;14(1):20812. doi: 10.1038/s41598-024-71892-4.

DOI:10.1038/s41598-024-71892-4
PMID:39242880
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11379871/
Abstract

Cheminformatics modeling has conventionally relied on supervised and unsupervised machine learning (ML) and deep learning models built on standard molecular descriptors, which represent the structural, physicochemical, and electronic properties of a compound. Deviating from this approach, we employed the classification Read-Across Structure-Activity Relationship (c-RASAR) framework, which combines classification-based quantitative structure-activity relationship (QSAR) modeling with Read-Across to incorporate Read-Across-derived similarity and error-based descriptors into a statistical and machine learning modeling framework. ML models developed from these RASAR descriptors use similarity-based information from the close source neighbors of a query compound. We applied different classification algorithms to the selected QSAR and RASAR descriptors to develop models that predict the hepatotoxicity of query compounds. The predictivity of each model was evaluated on a large test set, and the best-performing model was also used to screen a true external data set. Explainable AI (XAI) concepts coupled with Read-Across were used to interpret the contributions of the RASAR descriptors in the best c-RASAR model and to explain the chemical diversity in the dataset. Unsupervised dimensionality reduction techniques such as t-SNE and UMAP, together with the supervised ARKA framework, showed that the RASAR descriptors are more useful than the selected QSAR descriptors for grouping similar compounds, enhancing the modelability of the dataset, and efficiently identifying activity cliffs. Activity cliffs were also identified from Read-Across by examining the compounds that make up the nearest neighbors of a query compound. Compared with previously reported models developed on the same dataset, derived from the US FDA Orange Book ( https://www.accessdata.fda.gov/scripts/cder/ob/index.cfm ), our linear c-RASAR model is simple, reproducible, transferable, and highly predictive. The performance of the LDA c-RASAR model on the true external set surpasses that of the previously reported work. The present simple LDA c-RASAR model can therefore be used efficiently to predict the hepatotoxicity of query chemicals.
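
The abstract describes descriptors derived from the close source neighbors of a query compound, and activity cliffs read off from the nature of those neighbors. The sketch below illustrates that general idea only, under stated assumptions: fingerprints are represented as sets of on-bits, similarity is Tanimoto, and the descriptor names (`weighted_pos_fraction`, `mean_similarity`, `max_sim_pos`, `max_sim_neg`) and the cliff rule are hypothetical choices for illustration, not the descriptor set or similarity measures used in the paper.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rasar_like_descriptors(
    query: Set[int],
    sources: List[Tuple[Set[int], int]],
    k: int = 3,
) -> Dict[str, float]:
    """Illustrative similarity descriptors from the k closest source compounds.

    `sources` holds (fingerprint, label) pairs; label 1 = toxic, 0 = non-toxic.
    The descriptor names are assumptions made for this sketch.
    """
    # Rank source compounds by similarity to the query and keep the top k.
    scored = sorted(
        ((tanimoto(query, fp), label) for fp, label in sources),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    sims = [s for s, _ in scored]
    total = sum(sims)
    # Similarity-weighted share of toxic neighbors (a read-across style vote).
    weighted_pos = sum(s for s, y in scored if y == 1) / total if total else 0.0
    return {
        "weighted_pos_fraction": weighted_pos,
        "mean_similarity": total / len(sims) if sims else 0.0,
        "max_sim_pos": max((s for s, y in scored if y == 1), default=0.0),
        "max_sim_neg": max((s for s, y in scored if y == 0), default=0.0),
    }


def looks_like_activity_cliff(
    desc: Dict[str, float], label: int, threshold: float = 0.5
) -> bool:
    """Flag a compound whose close neighbors mostly carry the opposite label."""
    vote = 1 if desc["weighted_pos_fraction"] >= threshold else 0
    return vote != label


# Toy source set: (fingerprint on-bits, hepatotoxicity label). Invented data.
sources = [
    ({1, 2, 3, 4}, 1),
    ({1, 2, 3, 5}, 1),
    ({1, 6, 7, 8}, 0),
    ({8, 9, 10}, 0),
]
query = {1, 2, 3, 6}
desc = rasar_like_descriptors(query, sources, k=3)
print(desc)
print(looks_like_activity_cliff(desc, label=0))
```

In a c-RASAR-style workflow, a row of such descriptors would be computed for every training and test compound and then passed to a classifier such as linear discriminant analysis. The toy data here is invented purely to exercise the functions.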

