Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks.

Affiliations

College of Computer Science and Technology, Huaqiao University, Xiamen, 361021, China.

Research Department, Zhiye software, Xiamen, 361021, China.

Publication information

BMC Bioinformatics. 2019 Feb 1;20(1):62. doi: 10.1186/s12859-019-2617-8.

Abstract

BACKGROUND

Big data, powerful computation and new algorithmic techniques have driven a renaissance of deep learning, particularly in the combination of natural language processing (NLP) and deep neural networks. The advent of electronic medical records (EMRs) has not only changed the format of medical records but also helped users obtain information faster. However, research that works directly with Chinese EMRs faces many challenges: the records are low in quality, huge in quantity, imbalanced, and semi-structured or unstructured, and the Chinese language is denser than English. Effective word segmentation, word representation and model architecture are therefore the core technologies in the literature on Chinese EMRs.
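To illustrate why a domain dictionary matters for word segmentation, here is a minimal sketch of dictionary-based segmentation in Python. The abstract does not say which segmentation algorithm is used; forward maximum matching is swapped in here as one simple dictionary-driven approach. The function name `segment` and the toy dictionary entries are hypothetical and are not taken from the paper's pediatric dictionary.

```python
def segment(text, dictionary, max_len=None):
    """Forward maximum matching: at each position, take the longest dictionary entry."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical entries (bronchitis, bronchus, cough, fever, three days).
toy_dictionary = {"支气管炎", "支气管", "咳嗽", "发热", "三天"}
tokens = segment("咳嗽发热三天支气管炎", toy_dictionary)
print(tokens)
```

With this toy dictionary the term for bronchitis stays a single token instead of being split at the shorter entry for bronchus, which is the kind of behavior a medical dictionary is meant to provide. Text not covered by the dictionary falls back to single characters.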

RESULTS

In this paper, we propose a deep learning framework for intelligent diagnosis using Chinese EMR data, which applies a convolutional neural network (CNN) to EMR classification. The novelty of this paper is as follows: (1) We construct a pediatric medical dictionary based on Chinese EMRs. (2) We use word2vec word embeddings to obtain a semantic description of the content of Chinese EMRs. (3) We construct a fine-tuned CNN model for pediatric diagnosis that takes Chinese EMR data as input. Our results on real-world pediatric Chinese EMRs show that the average accuracy and F1-score of the CNN models reach up to 81%, which indicates that CNN models are effective for classifying EMRs. In particular, a fine-tuned one-layer CNN performs best among all CNN, recurrent neural network (RNN) (long short-term memory, gated recurrent unit) and CNN-RNN models, with average accuracy and F1-score both up to 83%.
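The forward pass of a one-layer CNN text classifier can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract gives no architectural details, so the embedding size, filter width, number of filters and number of classes are all hypothetical, random vectors stand in for trained word2vec embeddings, and the weights are untrained. The English placeholder tokens stand in for segmented Chinese words.

```python
import math
import random


def conv_max_pool(embedded, filt, bias):
    """Slide one filter over word windows, apply ReLU, keep the largest response."""
    width = len(filt)  # number of consecutive words the filter spans
    best = 0.0         # ReLU output is never below zero
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            word_vec = embedded[start + offset]
            total += sum(w * x for w, x in zip(filt[offset], word_vec))
        best = max(best, total)  # ReLU combined with max-over-time pooling
    return best


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict(tokens, embeddings, filters, biases, out_weights, out_bias):
    dim = len(next(iter(embeddings.values())))
    # Unknown tokens map to a zero vector (an assumption made for this sketch).
    embedded = [embeddings.get(t, [0.0] * dim) for t in tokens]
    features = [conv_max_pool(embedded, f, b) for f, b in zip(filters, biases)]
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(out_weights, out_bias)]
    return softmax(scores)


# Hypothetical sizes and random stand-ins for trained parameters.
rng = random.Random(0)
dim, n_filters, width, n_classes = 4, 3, 2, 2
vocab = ["cough", "fever", "three_days", "bronchitis"]
embeddings = {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in vocab}
filters = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
           for _ in range(n_filters)]
biases = [0.0] * n_filters
out_weights = [[rng.uniform(-1, 1) for _ in range(n_filters)]
               for _ in range(n_classes)]
out_bias = [0.0] * n_classes

probs = predict(["cough", "fever", "three_days"],
                embeddings, filters, biases, out_weights, out_bias)
print(probs)  # one probability per (hypothetical) diagnosis class
```

Because max-over-time pooling keeps only the strongest response of each filter, the features record which short word windows occur in a record but not where they occur.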

CONCLUSION

The CNN framework, which includes word segmentation, word embedding and model training, can serve as an intelligent auxiliary diagnosis tool for pediatricians. In particular, the strong performance of a fine-tuned one-layer CNN suggests that word order contributes little useful information in our Chinese EMRs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f51a/6359854/5dd245747425/12859_2019_2617_Fig1_HTML.jpg
