School of Computer Science and Technology, Heilongjiang University, Harbin, 150080, China.
School of Mathematical Science, Heilongjiang University, Harbin, 150080, China.
BMC Bioinformatics. 2020 Jun 5;21(1):230. doi: 10.1186/s12859-020-03554-x.
Inferring the diseases related to a patient's electronic medical records (EMRs) is of great significance for assisting doctors in diagnosis. Several recent prediction methods have shown that deep learning can capture the deep and complex information contained in EMRs. However, these methods do not consider the discriminative contributions of different phrases and words. Moreover, the local information and context information of EMRs should be deeply integrated.
A new method, referred to as FCNBLA, is proposed for predicting the disease related to a given EMR. It is based on the fusion of a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM) with attention mechanisms. FCNBLA deeply integrates local information, context information of the word sequence, and the more informative phrases and words. A novel deep learning framework is developed to learn three representations: the local representation, the context representation and the combination representation. The left side of the framework is built on a CNN and learns the local representation of adjacent words. The right side of the framework is built on a BiLSTM and learns the context representation of the word sequence. Not all phrases and words contribute equally to the representation of an EMR's meaning. Therefore, we establish attention mechanisms at the phrase level and the word level, and the middle module of the framework learns the combination representation of the enhanced phrases and words. FCNBLA achieved a macro-average F-score of 91.29% and an accuracy of 92.78%.
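The word-level attention idea described above can be illustrated with a minimal sketch. It assumes a common additive attention form (score each hidden state with a small tanh layer, normalise the scores with softmax, then take the weighted sum); the abstract does not give FCNBLA's exact formulation, so the function names `softmax` and `attention_pool`, the parameters `W`, `b`, `v`, and the toy values are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, W, b, v):
    """Additive attention pooling (an assumed form, not the paper's exact one).

    hidden_states: list of T vectors of dimension d (e.g. BiLSTM outputs)
    W: k x d projection matrix, b: bias of length k, v: context vector of length k
    Returns (pooled vector of dimension d, attention weights of length T).
    """
    scores = []
    for h in hidden_states:
        # u = tanh(W h + b): hidden representation of this word
        u = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        # Unnormalised importance: similarity between u and the context vector v
        scores.append(sum(u_i * v_i for u_i, v_i in zip(u, v)))

    weights = softmax(scores)

    # Weighted sum of the hidden states: informative words contribute more
    d = len(hidden_states[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, hidden_states)) for j in range(d)]
    return pooled, weights


# Toy example: three "words", each with a 2-dimensional hidden state
hidden_states = [[0.1, 0.2], [2.0, -1.0], [0.0, 0.3]]
W = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection
b = [0.0, 0.0]
v = [1.0, 0.0]                # hypothetical context vector

pooled, weights = attention_pool(hidden_states, W, b, v)
print("attention weights:", weights)
print("pooled representation:", pooled)
```

In this toy setting the second word receives the largest weight because its projected state aligns best with the context vector. The same pooling could in principle be applied to phrase-level features from a CNN, but how FCNBLA combines the two levels is not specified in the abstract.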
The experimental results indicate that FCNBLA yields superior performance compared with several state-of-the-art methods. The attention mechanisms and combination representations are also confirmed to be helpful for improving FCNBLA's prediction performance. Our method is helpful for assisting doctors in diagnosing diseases in patients.