Skolkovo Institute of Science and Technology, Bolshoy blvd., 30/1, Moscow, 121205, Russia.
Philips (Russia), Skolkovo Technopark 42, Building 1, Bolshoi Boulevard, Moscow, 121205, Russia.
Sci Rep. 2023 Mar 13;13(1):4171. doi: 10.1038/s41598-023-31223-5.
The proposed model for automatic clinical image caption generation combines the analysis of radiological scans with structured patient information from textual records. It uses two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records. The generated textual summary contains essential information about the pathologies found and their location, along with 2D heatmaps that localize each pathology on the scans. The model was tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO dataset. The results, measured with natural language assessment metrics, demonstrate its applicability to chest X-ray image captioning.
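The abstract does not say which natural language assessment metrics were used. As an illustration only, the sketch below computes clipped n-gram precision, the core quantity behind BLEU-style scores (without the brevity penalty or geometric averaging of full BLEU). The function name and the example captions are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def clipped_ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it appears
    in the reference (count clipping), as in BLEU's modified precision.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngrams(candidate)
    ref = ngrams(reference)
    total = sum(cand.values())
    if total == 0:
        # Candidate is too short to contain any n-gram of this order.
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical captions, for illustration only.
reference = "no acute cardiopulmonary abnormality"
generated = "no acute abnormality"
unigram_score = clipped_ngram_precision(generated, reference, n=1)
bigram_score = clipped_ngram_precision(generated, reference, n=2)
```

Here every word of the generated caption appears in the reference, so the unigram precision is high, while the bigram precision is lower because the dropped word breaks one of the two word pairs. Precision alone rewards short captions, which is why full metrics pair it with a length penalty or a recall term.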