Trade-off Predictivity and Explainability for Machine-Learning Powered Predictive Toxicology: An in-Depth Investigation with Tox21 Data Sets.

Affiliations

Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, 3900 NCTR Road, Jefferson, Arkansas 72079, United States.

Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.

Publication information

Chem Res Toxicol. 2021 Feb 15;34(2):541-549. doi: 10.1021/acs.chemrestox.0c00373. Epub 2021 Jan 29.

DOI:10.1021/acs.chemrestox.0c00373
PMID:33513003
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8867471/
Abstract

Selecting a model in predictive toxicology often involves a trade-off between prediction performance and explainability: should we sacrifice model performance to gain explainability, or vice versa? Here we present a comprehensive study assessing how algorithms and features influence model performance in chemical toxicity research. We built over 5000 models for a Tox21 bioassay data set of 65 assays and ∼7600 compounds. Seven molecular representations were used as features, and 12 modeling approaches varying in complexity and explainability were employed, to systematically investigate the impact of various factors on model performance and explainability. We demonstrated that the end point dictated a model's performance, regardless of the chosen modeling approach, including deep learning, and chemical features. Overall, more complex models such as (LS-)SVM and Random Forest performed marginally better than simpler models such as linear regression and KNN in the presented Tox21 data analysis. Because a simpler model with acceptable performance was often also easy to interpret for the Tox21 data set, it was clearly the preferred choice owing to its better explainability. Given that each data set has its own error structure for both dependent and independent variables, we strongly recommend conducting a systematic study across a broad range of model complexity and feature explainability to identify a model that balances predictivity and explainability.
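
The study design described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code. It assumes one model per (assay, representation, approach) combination, which would give 65 × 7 × 12 = 5460 models and is consistent with the "over 5000" figure. The labels, the `pick_simplest_acceptable` helper, the complexity ranks, the tolerance, and the toy scores are all hypothetical and do not come from the paper.

```python
from itertools import product

# Counts taken from the abstract.
N_ASSAYS = 65
N_REPRESENTATIONS = 7
N_APPROACHES = 12

# Hypothetical placeholder labels; the abstract does not list the actual names.
assays = [f"assay_{i}" for i in range(1, N_ASSAYS + 1)]
representations = [f"features_{i}" for i in range(1, N_REPRESENTATIONS + 1)]
approaches = [f"approach_{i}" for i in range(1, N_APPROACHES + 1)]

# Assumption: one model is built per (assay, representation, approach) triple.
grid = list(product(assays, representations, approaches))
print(len(grid))  # 65 * 7 * 12 = 5460


def pick_simplest_acceptable(scores, complexity, tolerance=0.02):
    """Return the least complex model whose score is within `tolerance` of the best.

    scores:     dict mapping model name -> performance (higher is better)
    complexity: dict mapping model name -> rank (lower = simpler, easier to explain)
    """
    best = max(scores.values())
    acceptable = [name for name, s in scores.items() if best - s <= tolerance]
    return min(acceptable, key=lambda name: complexity[name])


# Illustrative numbers only, NOT results from the paper.
toy_scores = {"linear": 0.79, "KNN": 0.70, "random_forest": 0.80}
toy_complexity = {"linear": 1, "KNN": 2, "random_forest": 3}
print(pick_simplest_acceptable(toy_scores, toy_complexity))  # linear
```

The selection rule mirrors the abstract's reasoning: when a complex model is only marginally better, the simplest model with acceptable performance is preferred. What counts as "acceptable" is a choice the abstract leaves open, so the tolerance here is a free parameter.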
