Zheng Wenbo, Yan Lan, Gou Chao, Zhang Zhi-Cheng, Jason Zhang Jun, Hu Ming, Wang Fei-Yue
School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China.
State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
Inf Fusion. 2021 Nov;75:168-185. doi: 10.1016/j.inffus.2021.05.015. Epub 2021 Jun 1.
The sudden increase in coronavirus disease 2019 (COVID-19) cases puts high pressure on healthcare services worldwide. At this stage, fast, accurate, and early clinical assessment of disease severity is vital. In general, two issues must be overcome: (1) current deep learning-based work suffers from a shortage of adequate multi-modal data; (2) in this scenario, multi-modal information (e.g., text, images) should be considered jointly to make accurate inferences. To address these challenges, we propose a multi-modal knowledge graph attention embedding for COVID-19 diagnosis. Our method not only learns relational embeddings from the nodes of a constructed knowledge graph but also has access to medical knowledge, aiming to improve classifier performance through a medical knowledge attention mechanism. The experimental results show that our approach significantly improves classification performance compared with other state-of-the-art techniques and remains robust across each modality of the multi-modal data. Moreover, we construct a new COVID-19 multi-modal dataset based on text mining, consisting of 1393 doctor-patient dialogues about COVID-19 patients with their 3706 images (347 X-ray, 2598 CT, 761 ultrasound) and 607 non-COVID-19 patient dialogues with their 10754 images (9658 X-ray, 494 CT, 761 ultrasound), together with fine-grained labels for all of them. We hope this work provides insights that encourage researchers in this area to shift their attention from medical images alone to doctor-patient dialogues and their corresponding medical images.
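To make the core idea concrete, below is a minimal sketch of a single-head graph attention layer of the kind the abstract alludes to: node embeddings (e.g., derived from dialogue text and medical images) attend over their neighbors in a constructed knowledge graph, and the attention weights re-weight evidence before classification. This is an illustrative assumption about the general mechanism, not the authors' implementation; all names, dimensions, and the toy graph are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeGraphAttention(nn.Module):
    """Single-head graph attention over knowledge-graph node embeddings.

    Each node attends to its neighbors; the attention scores play the role
    of a "knowledge attention" that re-weights related evidence.
    """
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (N, in_dim) node features (e.g., text/image-derived embeddings)
        # adj: (N, N) binary adjacency of the constructed knowledge graph
        h = self.proj(x)                                     # (N, out_dim)
        n = h.size(0)
        # Pairwise concatenation [h_i || h_j] for attention scoring.
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1),
             h.unsqueeze(0).expand(n, n, -1)], dim=-1)       # (N, N, 2*out_dim)
        scores = F.leaky_relu(self.attn(pairs).squeeze(-1))  # (N, N)
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)                # attention weights
        return F.elu(alpha @ h)                              # aggregated node embeddings

# Toy usage: 5 hypothetical nodes (symptom/image/dialogue entities), 16-dim features.
x = torch.randn(5, 16)
adj = (torch.rand(5, 5) > 0.5).float()
adj.fill_diagonal_(1.0)  # self-loops so every node attends to at least itself
emb = KnowledgeGraphAttention(16, 8)(x, adj)
print(emb.shape)  # torch.Size([5, 8])
```

In the full model described by the abstract, such graph-attended embeddings would be fused with the image-text classifier's features; the fusion step is omitted here for brevity.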