Zhu Qingqing, Mathai Tejas Sudharshan, Mukherjee Pritam, Peng Yifan, Summers Ronald M, Lu Zhiyong
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, USA.
Med Image Comput Comput Assist Interv. 2023 Oct;14224:189-198. doi: 10.1007/978-3-031-43904-9_19. Epub 2023 Oct 1.
Despite the reduction in turnaround times for radiology reporting with the use of speech recognition software, persistent communication errors can significantly impact the interpretation of radiology reports. Pre-filling a radiology report holds promise for mitigating reporting errors, yet despite multiple efforts in the literature to generate comprehensive medical reports, approaches that exploit the longitudinal nature of patient visit records in the MIMIC-CXR dataset are lacking. To address this gap, we propose to use longitudinal multi-modal data, i.e., the previous visit CXR, the current visit CXR, and the previous visit report, to pre-fill the "findings" section of the patient's current visit report. We first gathered the longitudinal visit information for 26,625 patients from the MIMIC-CXR dataset and created a new dataset called . With this new dataset, a transformer-based model was trained to capture the multi-modal longitudinal information from patient visit records (CXR images + reports) via a cross-attention-based multi-modal fusion module and a hierarchical memory-driven decoder. In contrast to previous works that use only current visit data as input to train a model, our work exploits the longitudinal information available to pre-fill the "findings" section of radiology reports. Experiments show that our approach outperforms several recent approaches by ≥3% on F1 score and by ≥2% on BLEU-4, METEOR, and ROUGE-L. Code will be published at https://github.com/CelestialShine/Longitudinal-Chest-X-Ray.
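The cross-attention-based fusion described above can be sketched minimally as follows. This is an illustrative single-head, numpy-only sketch, not the paper's implementation: the feature names, dimensions, and the choice of image patches as queries over prior-report tokens are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Fuse one modality (queries) with another (keys/values).

    query_feats:   (n_q, d) array, e.g. current-visit CXR patch features
    context_feats: (n_c, d) array, e.g. prior-visit report token features
    Returns fused features of shape (n_q, d): each query position becomes
    an attention-weighted mixture of the context features.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)  # (n_q, n_c)
    weights = softmax(scores, axis=-1)                   # rows sum to 1
    return weights @ context_feats                       # (n_q, d)

# Hypothetical shapes: 49 image patches and 32 report tokens, both embedded in d=64.
rng = np.random.default_rng(0)
img = rng.standard_normal((49, 64))  # current-visit image features (assumed)
txt = rng.standard_normal((32, 64))  # previous-visit report features (assumed)
fused = cross_attention(img, txt)
print(fused.shape)  # (49, 64)
```

In a full model, learned query/key/value projections and multiple heads would precede this step, and the fused features would feed the decoder; this sketch only shows the scaled dot-product cross-attention at the core of such a fusion module.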