Explainable discovery of disease biomarkers: The case of ovarian cancer to illustrate the best practice in machine learning and Shapley analysis.

Author information

School of Computing, Australian National University, Acton, ACT 2601, Australia.

School of Computing, Australian National University, Acton, ACT 2601, Australia; Department of Computing, University of Turku, Turku, Finland.

Publication information

J Biomed Inform. 2023 May;141:104365. doi: 10.1016/j.jbi.2023.104365. Epub 2023 Apr 14.

DOI:10.1016/j.jbi.2023.104365

PMID:37062419

Abstract

OBJECTIVE

Ovarian cancer is a significant health issue with lasting impacts on the community. Recent advances in surgical, chemotherapeutic and radiotherapeutic interventions have had only marginal impact because biomarkers cannot yet be identified at an early stage. Biomarker discovery is challenging, yet essential for improving drug discovery and clinical care. Compared to conventional methods, machine learning (ML) techniques are invaluable for recognising complex patterns in biomarkers, yet they can lack physical insight into diagnosis. eXplainable Artificial Intelligence (XAI) can provide deeper insight into the decision-making of complex ML algorithms, increasing their applicability. We aim to introduce best practice for combining ML and XAI techniques for biomarker validation tasks.

METHODS

We focused on classification tasks and a game-theoretic approach based on Shapley values to build and evaluate models and visualise results. We described the workflow and applied the pipeline in a case study using the CDAS PLCO Ovarian Biomarkers dataset to demonstrate its potential accuracy and utility.
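To make the Shapley-value idea concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a toy logistic classifier with made-up weights, three hypothetical standardised marker values, and an all-zero baseline standing in for "feature absent"; none of these come from the paper or from the PLCO dataset. It computes exact Shapley values by enumerating every feature subset, which is only feasible for a handful of features; practical work usually relies on approximations such as those in the SHAP library.

```python
from itertools import combinations
from math import exp, factorial


def predict_proba(x):
    """Toy classifier: logistic model with illustrative (not fitted) weights."""
    weights = [1.5, -2.0, 0.5]
    bias = -0.25
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    # An interaction term, so contributions are not simply additive.
    z += 0.8 * x[0] * x[1]
    return 1.0 / (1.0 + exp(-z))


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x.

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2^n, so this suits small n only.
    """
    n = len(x)

    def value(coalition):
        masked = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


# Hypothetical standardised values for three unnamed markers.
x = [1.2, 0.4, -0.7]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(predict_proba, x, baseline)

for name, contribution in zip(["marker_a", "marker_b", "marker_c"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The per-feature contributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the efficiency property that makes Shapley values attractive for attributing a classifier's output to individual biomarkers.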

RESULTS

The case study results demonstrate the efficacy of the ML pipeline, its consistency, and its advantages over conventional statistical approaches.

CONCLUSION

The resulting guidelines provide a general framework for the practical application of XAI in medical research that can inform clinicians and help validate and explain cancer biomarkers.
