Wang Rui, Liang Jianguo
College of Computer, Qufu Normal University, Rizhao, 276800, Shandong, China.
College of Computer, Qufu Normal University, Rizhao, 276800, Shandong, China.
Comput Methods Programs Biomed. 2025 Sep;269:108877. doi: 10.1016/j.cmpb.2025.108877. Epub 2025 May 23.
Automatic radiology report generation is a key research area at the intersection of computer science and medicine; its goal is to have computers generate reports from radiology images. The field currently faces a significant data bias problem: in the reports, words describing normal regions overshadow words describing diseases.
To address this, we propose the label knowledge guided transformer model for generating radiology reports. Specifically, our model incorporates a Multi Feature Extraction module and a Dual-branch Collaborative Attention module. The Multi Feature Extraction module uses medical knowledge graphs and feature clustering algorithms to optimize label feature extraction at two stages, the prediction and the encoding of label information, making it the first module specifically designed to reduce redundant label features. The Dual-branch Collaborative Attention module uses two parallel attention mechanisms to process visual features and label features simultaneously. Because label features are never integrated directly into visual features, the model's attention stays balanced between the two.
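The dual-branch idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the feature sizes, and in particular the scalar `gate` used to mix the two branch outputs are assumptions made for illustration, since the abstract does not state how the branches are combined. Learned projection matrices and multiple heads are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, keys, values):
    """Scaled dot-product attention; returns (output, weights)."""
    d = query.shape[-1]
    weights = softmax(query @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights

def dual_branch_attention(query, visual_feats, label_feats, gate=0.5):
    """Attend to visual and label features in two parallel branches.

    The two feature sets are never merged before attention; only the
    branch outputs are mixed. The scalar `gate` is a hypothetical
    mixing rule, not something taken from the paper.
    """
    visual_out, visual_w = attention(query, visual_feats, visual_feats)
    label_out, label_w = attention(query, label_feats, label_feats)
    fused = gate * visual_out + (1.0 - gate) * label_out
    return fused, visual_w, label_w

# Arbitrary illustrative sizes: 4 query positions, 49 visual tokens,
# 14 label embeddings, feature dimension 16.
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 16))
visual_feats = rng.normal(size=(49, 16))
label_feats = rng.normal(size=(14, 16))

fused, visual_w, label_w = dual_branch_attention(query, visual_feats, label_feats)
print(fused.shape, visual_w.shape, label_w.shape)  # (4, 16) (4, 49) (4, 14)
```

Keeping a separate set of attention weights per branch means that neither feature source can dominate the other inside a single softmax, which is one plausible reading of how the two are kept in balance.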
We evaluate the model on the IU X-Ray and MIMIC-CXR datasets using six natural language generation metrics. The results show that it achieves state-of-the-art (SOTA) performance: compared with the baseline models, the label knowledge guided transformer achieves an average improvement of 23.3% on the IU X-Ray dataset and 20.7% on the MIMIC-CXR dataset.
Our model captures abnormal features effectively, mitigates the adverse effects of data bias, and shows significant potential to improve the quality and accuracy of automatically generated radiology reports.