Kim Kiduk, Cho Kyungjin, Eo Yujeong, Kim Jeeyoung, Yun Jihye, Ahn Yura, Seo Joon Beom, Hong Gil-Sun, Kim Namkug
Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, 88 Olympic-Ro 43-Gil, Songpa-Gu, Seoul, 05505, Republic of Korea.
Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, University of Ulsan College of Medicine, Asan Medical Center, 88 Olympic-Ro 43-Gil, Songpa-Gu, Seoul, 05505, Republic of Korea.
J Imaging Inform Med. 2025 Apr;38(2):694-702. doi: 10.1007/s10278-024-01245-0. Epub 2024 Sep 11.
We aimed to evaluate the ability of deep learning (DL) models to identify patients from paired chest radiographs (CXRs) and to compare their performance with that of human experts. In this retrospective study, patient identification DL models were developed using 240,004 CXRs. The models were validated on multiple datasets covering different populations: an internal validation set, CheXpert, and Chest ImaGenome (CIG). Model performance was analyzed according to disease change status. The performance of the models in identifying patients from paired CXRs was compared with that of three junior radiology residents (group I), two senior radiology residents (group II), and two board-certified expert radiologists (group III). For the reader study, 240 patients (age, 56.617 ± 13.690 years; 113 females; 160 same pairs) were evaluated. A one-sided non-inferiority test was performed with a margin of 0.05. SimChest, our similarity-based DL model, demonstrated the best patient identification performance across multiple datasets, regardless of disease change status (area under the receiver operating characteristic curve range: internal validation, 0.992-0.999; CheXpert, 0.933-0.948; CIG, 0.949-0.951). The radiologists identified patients from the paired CXRs with a mean accuracy of 0.900 (95% confidence interval: 0.852-0.948), and performance increased with experience (mean accuracy: group I, 0.874; group II, 0.904; group III, 0.935). SimChest achieved an accuracy of 0.904, which was non-inferior to that of the radiologists (P for non-inferiority: 0.015). The findings of this diagnostic study indicate that DL models can screen for patient misidentification from a pair of CXRs with performance non-inferior to that of human experts.
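The non-inferiority comparison described above can be illustrated with a minimal sketch. The abstract does not state which test statistic was used, so the version below assumes a normal-approximation test on the difference of two independent accuracies. The function name `noninferiority_p_value` and the sample sizes of 240 per arm are hypothetical choices made for illustration, and the result is not expected to reproduce the reported P value of 0.015.

```python
import math


def noninferiority_p_value(acc_new, acc_ref, n_new, n_ref, margin=0.05):
    """One-sided p-value for H0: acc_new - acc_ref <= -margin.

    Assumption: a normal (Wald) approximation for two independent
    proportions. The study's actual test may differ.
    """
    diff = acc_new - acc_ref
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(
        acc_new * (1.0 - acc_new) / n_new + acc_ref * (1.0 - acc_ref) / n_ref
    )
    # Shift the difference by the margin, then standardize.
    z = (diff + margin) / se
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Accuracies are taken from the abstract; the sample sizes are an assumption.
p = noninferiority_p_value(acc_new=0.904, acc_ref=0.900, n_new=240, n_ref=240)
print(f"one-sided p-value: {p:.3f}")
```

A small p-value rejects the hypothesis that the model is worse than the readers by more than the margin. A paired design, in which the model and the readers assess the same cases, would normally call for a test that accounts for the correlation between their answers.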