School of Information Engineering, Dalian Ocean University, Dalian, China.
College of Computer Science and Technology, Dalian University of Technology, Dalian, China.
BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):55. doi: 10.1186/s12911-021-01426-9.
Clinical notes record the health status, clinical manifestations, and other detailed information of each patient. International Classification of Diseases (ICD) codes are important labels for electronic health records. Automatically assigning medical codes to clinical notes with deep learning models can improve work efficiency, accelerate the development of medical informatization, and help resolve many issues related to medical insurance. Recently, neural network-based methods have been proposed for automatic medical code assignment. However, clinical notes are usually long documents containing many complex sentences, and most current methods cannot effectively learn representations of the latent features in the document text.
In this paper, we propose a hybrid capsule network model. Specifically, we use a bi-directional LSTM (Bi-LSTM), which reads the sequence in both the forward and backward directions, to merge the information from both sides of the sequence. A label embedding framework embeds the text and the labels together to leverage the label information. We then use a dynamic routing algorithm in the capsule network to extract valuable features for the medical code prediction task.
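To make the routing step concrete, the following is a minimal sketch of routing-by-agreement as it is commonly formulated for capsule networks. It is not the authors' implementation: the function names (`squash`, `softmax`, `dynamic_routing`), the nested-list input layout, and the default of three iterations are all assumptions made for illustration, and the Bi-LSTM and label embedding stages are omitted.

```python
import math


def squash(s, eps=1e-9):
    """Rescale vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq + eps)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iterations=3):
    """Route input capsules to output capsules by agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j (hypothetical layout chosen for this sketch).
    Returns the output capsule vectors and the coupling coefficients
    used in the last iteration.
    """
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits start at zero
    v, c = [], []
    for _ in range(num_iterations):
        # Each input capsule distributes its output across the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            # Weighted sum of the predictions for output capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][k] for i in range(num_in))
                   for k in range(dim)]
            v.append(squash(s_j))
        # Raise the logit wherever a prediction agrees with the resulting output.
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(a * w for a, w in zip(u_hat[i][j], v[j]))
    return v, c
```

In this sketch, predictions that agree with an output capsule receive a larger coupling coefficient on the next iteration, which is the sense in which routing selects the most useful features.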
We applied our model to the task of automatically assigning medical codes to clinical notes and conducted a series of experiments on MIMIC-III data. The experimental results show that our method achieves a micro F1-score of 67.5% on the MIMIC-III dataset, outperforming other state-of-the-art methods.
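For readers unfamiliar with the metric, the sketch below shows how a micro-averaged F1-score is typically computed for multi-label code assignment: true positives, false positives, and false negatives are pooled over all notes and codes before precision and recall are calculated. The function name `micro_f1` and the example codes are illustrative assumptions, not data or code from the paper.

```python
def micro_f1(true_codes, predicted_codes):
    """Micro-averaged F1 over a collection of notes.

    true_codes and predicted_codes are parallel sequences; each element
    is the set of codes for one note.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_codes, predicted_codes):
        truth, pred = set(truth), set(pred)
        tp += len(truth & pred)   # codes correctly assigned
        fp += len(pred - truth)   # codes assigned but not in the gold set
        fn += len(truth - pred)   # gold codes that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative example with made-up notes (codes chosen only for demonstration).
gold = [{"401.9", "428.0"}, {"250.00"}]
pred = [{"401.9"}, {"250.00", "584.9"}]
score = micro_f1(gold, pred)
```

Because counts are pooled before averaging, frequent codes influence the micro F1-score more than rare ones.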
The proposed model, which employs the dynamic routing algorithm and the label embedding framework, can effectively capture the important features across sentences. Both capsule networks and domain knowledge are helpful for the medical code prediction task.