Liu Xinyao, Xin Junchang, Dai Bingtian, Shen Qi, Huang Zhihong, Wang Zhiqiong
College of Medicine and Biological Information Engineering, Northeastern University, 110819, China.
College of Computer Science and Engineering, Northeastern University, 110819, China.
Comput Methods Programs Biomed. 2025 Jan;258:108482. doi: 10.1016/j.cmpb.2024.108482. Epub 2024 Nov 14.
Automatic generation of medical reports reduces both the workload of radiologists and the risk of errors caused by inexperience. Models that combine attention mechanisms with contrastive learning can generate medical reports by capturing both general and specific semantics. However, existing contrastive learning methods ignore a distinctive property of medical data: a patient may suffer from multiple diseases at the same time. Without fine-grained relationships between samples, contrastive learning produces reports with insufficient specificity.
To address this problem, a label correlated contrastive learning method is proposed to encourage the model to generate higher-quality reports. First, a refined similarity matrix describing the contrastive relationships between reports is obtained by calculating the similarities between the multi-label classifications of the reports. Second, the image feature representations and the decoder embeddings, which carry semantic information, are projected into a hidden space. Third, label correlated contrastive learning is performed with the hidden representations of the images, the embeddings of the text, and the similarity matrix. During contrastive learning, "hard" negative samples that share more labels with the target sample are assigned larger weights. Finally, label correlated contrastive learning and the attention mechanism are combined to generate reports.
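The steps above can be sketched roughly as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the abstract does not give the similarity measure, the weighting formula, or the loss, so Jaccard similarity between multi-hot label vectors, the weight `1 + similarity` for negatives, an InfoNCE-style image-to-text loss, and the temperature `tau` are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def label_similarity(labels):
    """Pairwise Jaccard similarity between multi-hot label rows (N x C).

    Assumption: Jaccard is used only for illustration; the abstract does
    not say which similarity measure is applied to the labels.
    """
    labels = labels.astype(float)
    inter = labels @ labels.T                       # shared labels per pair
    counts = labels.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    # Pairs with no labels at all have union 0; give them similarity 0.
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def label_weighted_contrastive_loss(img_hidden, txt_hidden, sim, tau=0.1):
    """InfoNCE-style image-to-text loss with label-weighted negatives.

    Row i of img_hidden and row i of txt_hidden form the positive pair.
    Negative j gets weight 1 + sim[i, j] (an assumed weighting), so
    negatives sharing more labels with the anchor count more.
    """
    logits = l2_normalize(img_hidden) @ l2_normalize(txt_hidden).T / tau
    weights = 1.0 + sim
    np.fill_diagonal(weights, 1.0)                  # positives keep weight 1
    # Subtract the row maximum before exponentiating for numerical stability.
    row_max = logits.max(axis=1, keepdims=True)
    weighted_exp = np.exp(logits - row_max) * weights
    log_denominator = np.log(weighted_exp.sum(axis=1)) + row_max[:, 0]
    return float(np.mean(log_denominator - np.diag(logits)))
```

In this sketch, `img_hidden` and `txt_hidden` stand in for the projected image representations and decoder embeddings. Because every negative weight is at least 1, the weighted loss is never smaller than the unweighted one, and the extra penalty is concentrated on negatives whose label sets overlap with the anchor's.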
Comprehensive experiments are conducted on two widely used datasets, IU X-ray and MIMIC-CXR. On the IU X-ray dataset, our method achieves METEOR and ROUGE-L scores of 0.198 and 0.392, respectively. On the MIMIC-CXR dataset, our method achieves precision, recall, and F1 scores of 0.384, 0.376, and 0.304, respectively. The results indicate that the proposed method outperforms previous state-of-the-art models.
This work improves the performance of automatic medical report generation, making its application in computer-aided diagnosis feasible.