Sato Junya, Sugimoto Kento, Suzuki Yuki, Wataya Tomohiro, Kita Kosuke, Nishigaki Daiki, Tomiyama Miyuki, Hiraoka Yu, Hori Masatoshi, Takeda Toshihiro, Kido Shoji, Tomiyama Noriyuki
Department of Artificial Intelligence in Diagnostic Radiology, Osaka University Graduate School of Medicine, 2-2, Yamadaoka, Suita, Osaka, 565-0871, Japan; Department of Radiology, Osaka University Graduate School of Medicine, 2-2, Yamadaoka, Suita, Osaka, 565-0871, Japan.
Department of Medical Informatics, Osaka University Graduate School of Medicine, 2-2, Yamadaoka, Suita, Osaka, 565-0871, Japan.
EBioMedicine. 2024 Dec;110:105463. doi: 10.1016/j.ebiom.2024.105463. Epub 2024 Nov 28.
Artificial intelligence (AI) systems designed to detect abnormalities in abdominal computed tomography (CT) could reduce radiologists' workload and improve diagnostic processes. However, development of such models has been hampered by the shortage of large expert-annotated datasets. Here, we used information from free-text radiology reports, rather than manual annotations, to develop a deep-learning-based pipeline for comprehensive detection of abdominal CT abnormalities.
In this multicentre retrospective study, we developed a deep-learning-based pipeline to detect abnormalities in the liver, gallbladder, pancreas, spleen, and kidneys. Abdominal CT exams and their associated free-text reports, obtained during routine clinical practice at three institutions, were used for training and internal testing, while data collected from six institutions were used for external testing. A multi-organ segmentation model extracted organ-specific images from the CT scans, and an information extraction schema extracted disease information from the radiology reports; these outputs were used to train a multiple-instance learning model for anomaly detection. Performance was evaluated against radiologists' ground-truth labels using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F1 score.
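The evaluation metrics named above (AUC, accuracy, sensitivity, specificity, F1 score) can be sketched in plain Python. This is an illustrative implementation for scoring per-exam anomaly predictions against binary ground-truth labels, not the study's actual evaluation code; the function names and the 0.5 decision threshold are assumptions for the sketch.

```python
def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation:
    the probability that a random positive exam scores higher
    than a random negative exam (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def binary_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, specificity, and F1 at a fixed
    operating threshold (threshold choice is illustrative)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, preds))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, preds))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, preds))
    accuracy = (tp + tn) / len(labels)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return accuracy, sensitivity, specificity, f1
```

In practice a library such as scikit-learn would be used for these computations; the rank-sum form of the AUC is shown here because it makes the metric's interpretation explicit.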
We trained the model for each organ on images selected from 66,684 exams (39,255 patients) and tested it on 300 exams (295 patients) for internal validation and 600 exams (596 patients) for external validation. In the external test cohort, the overall AUC for detecting organ abnormalities was 0.886. Although models trained on human-annotated labels performed better given the same number of exams, models trained on larger datasets with labels auto-extracted via the information extraction schema significantly outperformed those trained on human-annotated labels.
Using disease information from routine clinical free-text radiology reports allows development of accurate anomaly detection models without requiring manual annotations. This approach is applicable to various anatomical sites and could streamline diagnostic processes.
Japan Science and Technology Agency.