Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), Porto, Portugal.
Faculty of Engineering of the University of Porto (FEUP), Porto, Portugal.
Sci Rep. 2022 Apr 21;12(1):6596. doi: 10.1038/s41598-022-10568-3.
The coronavirus disease 2019 (COVID-19) pandemic has impacted healthcare systems across the world. Chest radiography (CXR) can be used as a complementary method for diagnosing and following COVID-19 patients. However, the experience level and workload of technicians and radiologists may affect the decision process. Recent studies suggest that deep learning can be used to assess CXRs, providing an important second opinion for radiologists and technicians in the decision process, and super-human performance in the detection of COVID-19 has been reported in multiple studies. In this study, the clinical applicability of deep learning systems for COVID-19 screening was assessed by testing their performance in the detection of COVID-19. Specifically, four datasets were used: (1) a collection of multiple public datasets (284,793 CXRs); (2) the BIMCV dataset (16,631 CXRs); (3) COVIDGR (852 CXRs); and (4) a private dataset (6,361 CXRs). All datasets were collected retrospectively and consist of only frontal CXR views. A ResNet-18 was trained on each of the datasets for the detection of COVID-19. It is shown that a high dataset bias was present, leading to high performance in intradataset train-test scenarios (area under the curve > 0.98 on the collection of public datasets). However, significantly lower performance was obtained in interdataset train-test scenarios (area under the curve 0.55-0.84). A subset of the data was then assessed by radiologists for comparison to the automatic systems. Finetuning with radiologist annotations significantly increased performance across datasets (area under the curve 0.61-0.88) and improved the attention to clinical findings in positive COVID-19 CXRs. Nevertheless, tests on CXRs from different hospital services indicate that the screening performance of CXR and automatic systems is limited (area under the curve < 0.6 on emergency service CXRs). However, COVID-19 manifestations can be accurately detected when present, motivating the use of these tools for evaluating disease progression in mild to severe COVID-19 patients.
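The intradataset versus interdataset comparison described above can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `cross_dataset_auc` are hypothetical, and each trained model is assumed to be a plain callable that maps an image to a COVID-19 score. The area under the curve is computed here from its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.

```python
from itertools import product


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def cross_dataset_auc(models, test_sets):
    """Evaluate every model on every test set.

    models:    {train_name: callable(image) -> score}   (hypothetical interface)
    test_sets: {test_name: (images, labels)} with labels in {0, 1}
    Returns {(train_name, test_name): auc}. Entries where both names match are
    intradataset results; all other entries are interdataset results.
    """
    results = {}
    for train_name, predict in models.items():
        for test_name, (images, labels) in test_sets.items():
            scores = [predict(image) for image in images]
            results[(train_name, test_name)] = roc_auc(labels, scores)
    return results
```

A large gap between the matching-name entries and the remaining entries of the returned dictionary is the pattern the abstract attributes to dataset bias.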