Metabolomics Biomarker Discovery to Optimize Hepatocellular Carcinoma Diagnosis: Methodology Integrating AutoML and Explainable Artificial Intelligence.

Authors

Yagin Fatma Hilal, El Shawi Radwa, Algarni Abdulmohsen, Colak Cemil, Al-Hashem Fahaid, Ardigò Luca Paolo

Affiliations

Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, 44280 Malatya, Turkey.

Institute of Computer Science, Tartu University, 51009 Tartu, Estonia.

Publication information

Diagnostics (Basel). 2024 Sep 15;14(18):2049. doi: 10.3390/diagnostics14182049.

Abstract

This study assesses the efficacy of combining automated machine learning (AutoML) and explainable artificial intelligence (XAI) to identify metabolomic biomarkers that differentiate hepatocellular carcinoma (HCC) from liver cirrhosis in patients with hepatitis C virus (HCV) infection. We analyzed publicly accessible data encompassing HCC patients and cirrhotic controls. TPOT, an AutoML tool, was used to optimize feature and data preparation and to select the most suitable machine learning model. TreeSHAP, an XAI method, was used to interpret the model by assessing each metabolite's individual contribution to the classification. In distinguishing HCC from cirrhosis, TPOT outperformed the other AutoML approaches (AutoSKlearn and H2O AutoML) as well as traditional machine learning models such as random forest, support vector machine, and k-nearest neighbor. TPOT attained an AUC of 0.81, with higher accuracy, sensitivity, and specificity than the other models. Key metabolites, including L-valine, glycine, and DL-isoleucine, were identified as important by TPOT and subsequently verified by TreeSHAP analysis. TreeSHAP explained how these metabolites contributed to the model's predictions, thereby increasing the interpretability and dependability of the results. This assessment highlights the strength and reliability of the AutoML framework in the development of clinical biomarkers. The study shows that AutoML and XAI can be used together to identify metabolomic biomarkers that are specific to HCC. The performance of TPOT relative to traditional models highlights its capacity to identify biomarkers, and TreeSHAP improved model transparency by highlighting the relevance of particular metabolites. This combined method has the potential to enhance biomarker identification and to generate precise, easily understandable, AI-driven solutions for diagnosing HCC.
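To make "each metabolite's individual contribution" more concrete, the following is a minimal sketch of Shapley-value attribution, the quantity that SHAP methods estimate. It is not TreeSHAP and not the study's fitted model: it enumerates every feature subset by brute force, which is only feasible for a handful of features, whereas TreeSHAP is a fast algorithm specialized for tree-based models. The scoring function `toy_model`, its coefficients, the sample values, and the all-zero baseline are hypothetical and chosen purely for illustration; only the metabolite names come from the abstract.

```python
from itertools import combinations
from math import factorial

# Metabolite names taken from the abstract; everything else here is assumed.
FEATURES = ["L-valine", "glycine", "DL-isoleucine"]


def toy_model(x):
    """Hypothetical score with one interaction term (not the study's model)."""
    return (
        2.0 * x["L-valine"]
        + 1.0 * x["glycine"]
        + 0.5 * x["L-valine"] * x["DL-isoleucine"]
    )


def shapley_values(model, sample, baseline):
    """Exact Shapley values by enumerating all subsets of the other features.

    A feature that is "absent" from a subset takes its baseline value.
    """
    n = len(FEATURES)
    phi = {}
    for feature in FEATURES:
        others = [g for g in FEATURES if g != feature]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                # Standard Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                without = {
                    g: (sample[g] if g in subset else baseline[g])
                    for g in FEATURES
                }
                with_feature = dict(without)
                with_feature[feature] = sample[feature]
                total += weight * (model(with_feature) - model(without))
        phi[feature] = total
    return phi


# Hypothetical standardized measurements for one patient and a zero baseline.
sample = {"L-valine": 1.0, "glycine": 2.0, "DL-isoleucine": 4.0}
baseline = {"L-valine": 0.0, "glycine": 0.0, "DL-isoleucine": 0.0}

phi = shapley_values(toy_model, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the model output for the sample and for the baseline, and the interaction term is split evenly between the two features involved. This additivity is what allows per-metabolite contributions to be read as shares of a single prediction.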

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/99bd/11431471/70f386054340/diagnostics-14-02049-g001.jpg
