Automated Detection of Cancer-Suspicious Findings in Japanese Radiology Reports with Natural Language Processing: A Multicenter Study.

Authors

Sugimoto Kento, Wada Shoya, Konishi Shozo, Sato Junya, Okada Katsuki, Kido Shoji, Tomiyama Noriyuki, Matsumura Yasushi, Takeda Toshihiro

Affiliations

Department of Medical Informatics, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Osaka, Japan.

Department of Transformative System for Medical Information, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Osaka, Japan.

Publication information

J Imaging Inform Med. 2025 Jan 22. doi: 10.1007/s10278-024-01338-w.

Abstract

Missed critical imaging findings, particularly those indicating cancer, are a common problem that can delay patient follow-up and treatment. To address this, we developed a rule-based natural language processing (NLP) algorithm to detect cancer-suspicious findings in Japanese radiology reports. The dataset consisted of chest and abdomen CT reports from six institutions. Reports from our institution were used for algorithm development and internal evaluation, while reports from the other five institutions were used for external evaluation. To create the gold standard, two experienced physicians annotated the reports. Performance was evaluated using precision, recall, and F1 score with 1000 bootstrap iterations. BERT was used as a baseline deep learning model, and its performance was compared with that of the proposed rule-based method. At the report level, the overall precision, recall, and F1 score of the rule-based algorithm were 0.886, 0.886, and 0.883, respectively, higher than those of the deep learning model (0.851, 0.679, and 0.733). The overall results include both internal and external validation data. For the internal validation set, the precision, recall, and F1 score were 0.929, 0.929, and 0.927, respectively. For the external validation set, the precision, recall, and F1 score were 0.875, 0.879, and 0.873, respectively, demonstrating generalizability. In conclusion, the rule-based NLP algorithm achieved high performance in detecting cancer-suspicious findings in multi-institutional CT reports.
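The evaluation described in the abstract (precision, recall, and F1 score over 1000 bootstrap iterations) can be illustrated with a minimal sketch. This is not the authors' code. It assumes binary report-level labels (1 = cancer-suspicious, 0 = not), resampling of reports with replacement, and a percentile confidence interval; the abstract does not state the averaging scheme or interval method, and the function names `precision_recall_f1` and `bootstrap_metrics` are hypothetical.

```python
import random


def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Fall back to 0.0 when a denominator is empty (e.g. a resample
    # that contains no positive reports).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def bootstrap_metrics(y_true, y_pred, n_iterations=1000, alpha=0.05, seed=0):
    """Resample reports with replacement and summarize each metric.

    Returns the bootstrap mean and a percentile interval per metric.
    The percentile interval is an assumption of this sketch.
    """
    rng = random.Random(seed)
    n = len(y_true)
    samples = []
    for _ in range(n_iterations):
        idx = [rng.randrange(n) for _ in range(n)]
        samples.append(precision_recall_f1([y_true[i] for i in idx],
                                           [y_pred[i] for i in idx]))
    summary = {}
    for k, name in enumerate(("precision", "recall", "f1")):
        values = sorted(s[k] for s in samples)
        lower = values[int((alpha / 2) * n_iterations)]
        upper = values[min(n_iterations - 1,
                           int((1 - alpha / 2) * n_iterations))]
        summary[name] = {"mean": sum(values) / n_iterations,
                         "ci": (lower, upper)}
    return summary


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    gold = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 0, 1, 0, 1, 0]
    print(precision_recall_f1(gold, predicted))
    print(bootstrap_metrics(gold, predicted))
```

Resampling whole reports rather than individual findings keeps the unit of analysis at the report level, which matches the level at which the abstract reports its results.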
