Liu Guangyi, Liao Yinghong, Wang Fuyu, Zhang Bin, Zhang Lu, Liang Xiaodan, Wan Xiang, Li Shaolin, Li Zhen, Zhang Shuixing, Cui Shuguang
IEEE Trans Neural Netw Learn Syst. 2021 Sep;32(9):3786-3797. doi: 10.1109/TNNLS.2021.3099165. Epub 2021 Aug 31.
Medical imaging technologies, including computed tomography (CT) and chest X-ray (CXR), are widely used to support the diagnosis of COVID-19. Because writing reports manually is time-consuming, an auxiliary medical system that can generate medical reports automatically and immediately is urgently needed. In this article, we propose the medical visual language BERT (Medical-VLBERT) model, which identifies abnormalities on COVID-19 scans and automatically generates a medical report based on the detected lesion regions. To produce more accurate medical reports and minimize visual-and-linguistic differences, the model adopts an alternate learning strategy with two procedures: knowledge pretraining and transferring. The knowledge pretraining procedure memorizes knowledge from medical texts, while the transferring procedure uses the acquired knowledge to generate professional medical sentences from observations of medical images. For automatic medical report generation on COVID-19 cases, we constructed a dataset of 368 medical findings in Chinese and 1104 chest CT scans from The First Affiliated Hospital of Jinan University, Guangzhou, China, and The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China. To alleviate the shortage of COVID-19 training samples, the model was first trained on the large-scale Chinese CX-CHR dataset and then transferred to the COVID-19 CT dataset for further fine-tuning. The experimental results show that Medical-VLBERT achieved state-of-the-art performance on terminology prediction and report generation on both the Chinese COVID-19 CT dataset and the CX-CHR dataset. The Chinese COVID-19 CT dataset is available at https://covid19ct.github.io/.
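The two-stage schedule described in the abstract (train first on a large source dataset, then fine-tune on a small target dataset) can be illustrated with a toy sketch. This is not the Medical-VLBERT architecture or the authors' training code: the linear model, the synthetic data, the learning rate, and every function name below are assumptions chosen only to make the idea runnable with the Python standard library.

```python
import random


def make_data(true_w, n, rng):
    """Build n noise-free synthetic samples (x, y) with y = true_w . x."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        y = sum(w * xi for w, xi in zip(true_w, x))
        data.append((x, y))
    return data


def mse(w, data):
    """Mean squared error of the linear model w on data."""
    total = 0.0
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += (pred - y) ** 2
    return total / len(data)


def sgd_fit(w, data, lr, epochs):
    """Plain SGD on squared error, starting from w; returns a new weight list."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


rng = random.Random(0)
# Hypothetical stand-ins: a large "source" set and a small, slightly
# shifted "target" set (playing the roles of CX-CHR and the COVID-19 CT data).
source = make_data([2.0, -1.0], 500, rng)
target = make_data([2.2, -0.8], 10, rng)

# Stage 1: train from scratch on the large source set.
pretrained = sgd_fit([0.0, 0.0], source, lr=0.05, epochs=20)
pretrained_loss = mse(pretrained, target)

# Stage 2: continue training from the stage-1 weights on the small target set.
tuned = sgd_fit(pretrained, target, lr=0.05, epochs=50)
tuned_loss = mse(tuned, target)

print("target loss before fine-tuning:", pretrained_loss)
print("target loss after fine-tuning: ", tuned_loss)
```

In this sketch the fine-tuning stage starts from the source-trained weights rather than from zeros, which is the only property it shares with the transfer step summarized in the abstract.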