An Explainable AI Approach for the Rapid Diagnosis of COVID-19 Using Ensemble Learning Algorithms.

Affiliations

Department of Software Engineering, College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.

Academy of Military Sciences, Beijing, China.

Publication information

Front Public Health. 2022 Jun 21;10:874455. doi: 10.3389/fpubh.2022.874455. eCollection 2022.

Abstract

BACKGROUND

Artificial intelligence-based disease prediction models have a greater potential to screen COVID-19 patients than conventional methods. However, their application has been restricted because of their underlying black-box nature.

OBJECTIVE

To address this issue, an explainable artificial intelligence (XAI) approach was developed to screen patients for COVID-19.

METHODS

Data from a retrospective study of 1,737 participants (759 COVID-19 patients and 978 controls) admitted to San Raphael Hospital (OSR) from February to May 2020 were used to construct a diagnostic model. In the final analysis, 32 key blood test indices from 1,374 participants were used to screen for COVID-19. Four ensemble learning algorithms were applied: random forest (RF), adaptive boosting (AdaBoost), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost). Feature importance was interpreted from a clinical perspective, and individual predictions were visualized with local interpretable model-agnostic explanations (LIME) plots.
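The idea behind a LIME-style explanation can be sketched in a few lines: perturb one patient's feature values, query the black-box model on the perturbed samples, weight each sample by its proximity to the patient, and fit a weighted linear surrogate whose coefficients serve as the local explanation. The sketch below is a simplified illustration only, not the authors' pipeline; the published LIME method and the `lime` package include further steps that are omitted here. The names `toy_risk_model` and `local_linear_explanation`, the two standardized features, and all coefficients are hypothetical.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_linear_explanation(predict, instance, n_samples=500, scale=1.0,
                             kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate to `predict` around `instance`.

    Features are centered on `instance`, so the returned intercept is the
    surrogate's prediction at the instance and each coefficient is a local slope.
    """
    rng = random.Random(seed)
    d = len(instance)
    # Accumulate the weighted normal equations (X^T W X) beta = X^T W y.
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [v + rng.gauss(0.0, scale) for v in instance]  # perturbed sample
        offsets = [zi - vi for zi, vi in zip(z, instance)]
        dist2 = sum(o * o for o in offsets)
        w = math.exp(-dist2 / kernel_width ** 2)  # closer samples weigh more
        row = [1.0] + offsets
        y = predict(z)
        for i in range(d + 1):
            xtwy[i] += w * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * row[i] * row[j]
    beta = solve_linear_system(xtwx, xtwy)
    return beta[0], beta[1:]


def toy_risk_model(z):
    """Hypothetical black box: logistic score over two standardized features."""
    return 1.0 / (1.0 + math.exp(-(1.5 * z[0] + 0.8 * z[1] - 1.0)))


if __name__ == "__main__":
    intercept, coefs = local_linear_explanation(toy_risk_model, [0.5, 0.2])
    for name, coef in zip(["feature_a", "feature_b"], coefs):
        print(f"{name}: local slope {coef:+.3f}")
```

A positive local slope means that, near this particular patient, increasing the feature pushes the model's score upward. This per-instance reading is what a LIME plot visualizes.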

RESULTS

The GBDT model [area under the curve (AUC): 0.864; 95% confidence interval (CI): 0.821-0.907] outperformed the RF model (AUC: 0.857; 95% CI: 0.813-0.902), the AdaBoost model (AUC: 0.854; 95% CI: 0.810-0.899), and the XGBoost model (AUC: 0.849; 95% CI: 0.803-0.894) in distinguishing patients with COVID-19 from those without. The cumulative feature importance values of lactate dehydrogenase, white blood cell count, and eosinophil count were 0.145, 0.130, and 0.128, respectively.
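To make the reported metric concrete, the following sketch computes an AUC on the 0-1 scale as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, together with a percentile-bootstrap 95% confidence interval. The abstract does not state how its intervals were obtained, so the bootstrap here is an assumption, and the labels and scores are made-up toy values rather than study data.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy labels (1 = case, 0 = control) and model scores; not study data.
    y_true = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.55, 0.7, 0.3]
    point = auc(y_true, y_score)
    low, high = bootstrap_auc_ci(y_true, y_score)
    print(f"AUC {point:.3f} (95% CI {low:.3f}-{high:.3f})")
```

With only a handful of samples the interval is wide. Overlapping intervals between models, as in the results above, mean that the ranking of the point estimates alone does not establish that one model is reliably better than another.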

CONCLUSIONS

Ensemble machine learning (ML) approaches, mainly GBDT combined with LIME plots, are efficient for screening patients with COVID-19 and might serve as a tool for the auxiliary diagnosis of COVID-19. Patients with a higher white blood cell (WBC) count, lactate dehydrogenase (LDH) level, or eosinophil (EOT) count were more likely to have COVID-19.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/116b/9253566/7a515c8e8ba6/fpubh-10-874455-g0001.jpg
