Explaining machine learning based diagnosis of COVID-19 from routine blood tests with decision trees and criteria graphs.

Author information

Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais, Av. Antônio Carlos 6627, 31270-901, Belo Horizonte, MG, Brazil.

Publication information

Comput Biol Med. 2021 May;132:104335. doi: 10.1016/j.compbiomed.2021.104335. Epub 2021 Mar 16.

Abstract

The sudden outbreak of coronavirus disease 2019 (COVID-19) revealed the need for fast and reliable automatic tools to help health teams. This paper aims to present understandable solutions based on Machine Learning (ML) techniques for COVID-19 screening from routine blood tests. We tested different ML classifiers on a public dataset from the Hospital Albert Einstein, São Paulo, Brazil. After cleaning and pre-processing, the dataset contains 608 patients, of whom 84 are positive for COVID-19 as confirmed by RT-PCR. To understand the model decisions, we introduce (i) a Decision Tree Explainer (DTX) for local explanation and (ii) a Criteria Graph to aggregate these explanations and portray a global picture of the results. The Random Forest (RF) classifier achieved the best results (accuracy 0.88, F1-score 0.76, sensitivity 0.66, specificity 0.91, and AUROC 0.86). By applying DTX and the Criteria Graph to cases confirmed by the RF, it was possible to find patterns among individuals that can help clinicians understand the interconnections among the blood parameters, either globally or on a case-by-case basis. The results are in accordance with the literature, and the proposed methodology may be embedded in an electronic health record system.
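
The abstract does not specify how the Criteria Graph is built, so the following is only a minimal sketch of one plausible way to aggregate local explanations into a global view. It assumes each local explanation is an ordered list of criteria along one patient's decision path, and it counts how often each criterion appears and how often one criterion directly follows another. The function name `build_criteria_graph`, the parameter names (`param_A`, ...), and the threshold labels (`t1`, ...) are hypothetical placeholders, not values or identifiers from the paper.

```python
from collections import Counter

# Hypothetical local explanations (illustrative only, not data from the paper).
# Each inner list is the ordered sequence of criteria on the root-to-leaf path
# of one patient's local decision tree, written as (parameter, condition).
local_paths = [
    [("param_A", "<= t1"), ("param_B", "<= t2"), ("param_C", "<= t3")],
    [("param_A", "<= t1"), ("param_B", "<= t2")],
    [("param_B", "<= t2"), ("param_C", "<= t3")],
]


def build_criteria_graph(paths):
    """Aggregate decision paths into node and directed-edge counts.

    Nodes are criteria; an edge (u, v) means criterion v directly
    followed criterion u on at least one path. This is an assumed
    construction, not necessarily the authors' exact definition.
    """
    node_counts = Counter()
    edge_counts = Counter()
    for path in paths:
        node_counts.update(path)                 # one count per criterion occurrence
        edge_counts.update(zip(path, path[1:]))  # consecutive criteria form an edge
    return node_counts, edge_counts


nodes, edges = build_criteria_graph(local_paths)

# List each edge with the number of patients whose path contains it.
for (src, dst), weight in edges.most_common():
    print(f"{src[0]} {src[1]} -> {dst[0]} {dst[1]}: {weight}")
```

In a sketch like this, heavily weighted nodes and edges would point to criteria that recur across many individual explanations, which is the kind of global pattern the abstract says the Criteria Graph is meant to surface.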

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/56e0/7962588/884ffc17e51d/gr1_lrg.jpg