Department of Medical Ultrasonics, Institute for Diagnostic and Interventional Ultrasound, The First Affiliated Hospital, Sun Yat-Sen University, No. 58, Zhongshan Er Road, Guangzhou, 510080, People's Republic of China.
School of Computer Science and Engineering, Sun Yat-Sen University, No. 132, East Outer Ring Road, Guangzhou, 510006, People's Republic of China.
BMC Med. 2024 Jan 25;22(1):29. doi: 10.1186/s12916-024-03247-9.
A previously trained deep learning-based smartphone app provides an artificial intelligence solution to help diagnose biliary atresia from sonographic gallbladder images, but it might be impractical to deploy in real clinical settings. This study aimed to redevelop a new model using original sonographic images and their derived smartphone photos, and then to test the new model's performance in assisting radiologists with different levels of experience to detect biliary atresia in real-world mimic settings.
A new model was first trained retrospectively on 3659 original sonographic gallbladder images and the 51,226 smartphone photos derived from them, and then tested on 11,410 external validation smartphone photos. Afterward, the new model was tested on 333 prospectively collected sonographic gallbladder videos from 207 infants by 14 inexperienced radiologists (9 junior and 5 senior) and 4 experienced pediatric radiologists in real-world mimic settings. Diagnostic performance was expressed as the area under the receiver operating characteristic curve (AUC).
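As a minimal illustration of the evaluation metric only (not the authors' code), the per-image AUC could be computed from the model's predicted probabilities and the binary reference labels; the labels and probabilities below are hypothetical placeholders.

```python
# Hedged sketch: scoring diagnostic performance as AUC from predicted BA probabilities.
from sklearn.metrics import roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels: 1 = biliary atresia, 0 = non-BA
y_prob = [0.91, 0.12, 0.78, 0.66, 0.35, 0.08, 0.84, 0.41]  # hypothetical model probabilities

auc = roc_auc_score(y_true, y_prob)  # area under the ROC curve
print(f"AUC = {auc:.3f}")
```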
The new model outperformed the previously published model in diagnosing biliary atresia (BA) on the external validation set (AUC 0.924 vs 0.908, P = 0.004), with higher consistency (kappa value 0.708 vs 0.609). When tested in real-world mimic settings on the 333 sonographic gallbladder videos, the new model performed comparably to the experienced pediatric radiologists (average AUC 0.860 vs 0.876) and outperformed both the junior radiologists (average AUC 0.838 vs 0.773) and the senior radiologists (average AUC 0.829 vs 0.749). Furthermore, the new model helped both junior and senior radiologists improve their diagnostic performance, with the average AUC increasing from 0.773 to 0.835 for junior radiologists and from 0.749 to 0.805 for senior radiologists.
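The reported consistency is a kappa statistic; as one hedged reading (an assumption for illustration, not the authors' evaluation code), Cohen's kappa could be computed between the model's binary predictions on original sonographic images and on their derived smartphone photos, using hypothetical paired predictions.

```python
# Hedged sketch: Cohen's kappa as a consistency measure between paired binary predictions.
from sklearn.metrics import cohen_kappa_score

pred_original = [1, 0, 1, 1, 0, 0, 1, 0]    # hypothetical predictions on original sonographic images
pred_smartphone = [1, 0, 1, 0, 0, 0, 1, 1]  # hypothetical predictions on derived smartphone photos

kappa = cohen_kappa_score(pred_original, pred_smartphone)  # agreement corrected for chance
print(f"kappa = {kappa:.3f}")
```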
The interpretable app-based model showed robust and satisfactory performance in diagnosing biliary atresia, and it could help radiologists with limited experience improve their diagnostic performance in real-world mimic settings.