

Multimodal Generative Artificial Intelligence Model for Creating Radiology Reports for Chest Radiographs in Patients Undergoing Tuberculosis Screening.

Author information

Hong Eun Kyoung, Kim Hae Won, Song Ok Kyu, Lee Kyu-Chong, Kim Dong Kyu, Cho Jae-Bock, Kim Jungbin, Lee Seungho, Bae Woong, Roh Byungseok

Affiliations

Mass General Brigham, Boston, USA.

St. Mary's Hospital, Seoul, South Korea.

Publication information

AJR Am J Roentgenol. 2025 Jul 2. doi: 10.2214/AJR.25.33059.

Abstract

Chest radiographs play a crucial role in tuberculosis screening in high-prevalence regions, although widespread radiographic screening requires expertise that may be unavailable in settings with limited medical resources. The objective of this study was to evaluate a multimodal generative artificial intelligence (AI) model for detecting tuberculosis-associated abnormalities on chest radiography in patients undergoing tuberculosis screening. This retrospective study evaluated 800 chest radiographs obtained from two public datasets originating from tuberculosis screening programs. A generative AI model was used to create free-text reports for the radiographs. AI-generated reports were classified in terms of presence versus absence and laterality of tuberculosis-related abnormalities. Two radiologists independently reviewed the radiographs for tuberculosis presence and laterality in separate sessions, without and with use of AI-generated reports, and recorded whether they would accept each report without modification. Two additional radiologists reviewed radiographs and clinical readings from the datasets to determine the reference standard. By the reference standard, 422/800 radiographs were positive for tuberculosis-related abnormalities. For detection of tuberculosis-related abnormalities, sensitivity, specificity, and accuracy were 95.2%, 86.7%, and 90.8% for AI-generated reports; 93.1%, 93.6%, and 93.4% for reader 1 without AI-generated reports; 93.1%, 95.0%, and 94.1% for reader 1 with AI-generated reports; 95.8%, 87.2%, and 91.3% for reader 2 without AI-generated reports; and 95.8%, 91.5%, and 93.5% for reader 2 with AI-generated reports. Accuracy was significantly lower for AI-generated reports than for both readers alone (p<.001), but significantly higher with than without AI-generated reports for one reader (reader 1: p=.47; reader 2: p=.47). Localization performance was significantly lower (p<.001) for AI-generated reports (63.3%) than for reader 1 (79.9%) and reader 2 (77.9%) without AI-generated reports and did not significantly change for either reader with AI-generated reports (reader 1: 78.7%, p=.71; reader 2: 81.5%, p=.23). Reader 1 accepted 91.7% of AI-generated reports for normal radiographs and 52.4% for abnormal radiographs; reader 2 accepted 83.2% and 37.0%, respectively. While AI-generated reports may augment radiologists' diagnostic assessments, the current model requires human oversight given its inferior standalone performance. The generative AI model could have potential application in aiding tuberculosis screening programs in medically underserved regions, although technical improvements are still required.
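For readers less familiar with the metrics quoted in the abstract, the following is a minimal sketch of how sensitivity, specificity, and accuracy are conventionally derived from a 2x2 confusion matrix. The class name, function name, and the counts used are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not describe the authors' own analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary reading against a reference standard (hypothetical helper)."""
    tp: int  # abnormal by reference, called abnormal
    fn: int  # abnormal by reference, called normal
    tn: int  # normal by reference, called normal
    fp: int  # normal by reference, called abnormal


def screening_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, and accuracy as fractions in [0, 1]."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    total = positives + negatives
    return {
        "sensitivity": c.tp / positives,    # share of abnormal cases detected
        "specificity": c.tn / negatives,    # share of normal cases cleared
        "accuracy": (c.tp + c.tn) / total,  # share of all cases read correctly
    }


# Illustrative counts only (assumption): 50 abnormal and 50 normal radiographs.
example = ConfusionCounts(tp=45, fn=5, tn=40, fp=10)
metrics = screening_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because accuracy is the prevalence-weighted average of sensitivity and specificity, a reading with higher sensitivity but lower specificity than another can still end up with lower overall accuracy, which is the pattern the abstract reports for the AI-generated reports relative to reader 1.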

