Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
Division of Thoracic Imaging and Intervention, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
Medicine (Baltimore). 2022 Jul 22;101(29):e29587. doi: 10.1097/MD.0000000000029587.
This study aimed to tune and test the generalizability of a deep learning-based model for assessing COVID-19 lung disease severity on chest radiographs (CXRs) from different patient populations. A published convolutional Siamese neural network-based model, previously trained on CXRs from hospitalized patients with COVID-19, was tuned using 250 outpatient CXRs. This model produces a quantitative measure of COVID-19 lung disease severity, the pulmonary x-ray severity (PXS) score. The model was evaluated on CXRs from 4 test sets: 3 from the United States (patients hospitalized at an academic medical center (N = 154), patients hospitalized at a community hospital (N = 113), and outpatients (N = 108)) and 1 from Brazil (patients at an academic medical center emergency department (N = 303)). Radiologists from both countries independently assigned reference standard CXR severity scores, and the correlation between these scores and the PXS scores (Pearson R) served as the measure of model performance. The Uniform Manifold Approximation and Projection (UMAP) technique was used to visualize the neural network results. After tuning with outpatient data, the model performed well on the 2 United States hospitalized patient datasets (R = 0.88 and R = 0.90, compared to baseline R = 0.86). Performance was similar, though slightly lower, on the United States outpatient and Brazil emergency department datasets (R = 0.86 and R = 0.85, respectively). UMAP showed that the model learned disease severity information that generalized across test sets. A deep learning model that extracts a COVID-19 severity score from CXRs thus showed generalizable performance across multiple populations from 2 continents, including outpatients and hospitalized patients.
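As an illustration of the performance metric named in the abstract, the following is a minimal sketch of how a Pearson correlation between radiologist reference scores and model-derived PXS scores could be computed. The function name `pearson_r`, the variable names, and all numeric values are hypothetical and chosen only for illustration; they are not data or code from the study, which does not describe its implementation here.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, NOT data from the study.
reference_scores = [0, 2, 3, 5, 8, 10]        # hypothetical radiologist severity scores
pxs_scores = [0.4, 1.9, 3.5, 4.8, 8.3, 9.6]   # hypothetical model PXS scores

r = pearson_r(reference_scores, pxs_scores)
print(f"Pearson R = {r:.2f}")
```

In this sketch, a value of R close to 1 would indicate that the model's scores rise and fall in step with the radiologists' scores, which is the sense in which the abstract reports values such as R = 0.88.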