Medical Knowledge Extraction and Analysis from Electronic Medical Records Using Deep Learning.

Author information

Li Pei-Lin, Yuan Zhen-Ming, Tu Wen-Bo, Yu Kai, Lu Dong-Xin

Affiliation

Engineering Research Center of Mobile Health Management, Ministry of Education, Hangzhou Normal University, Hangzhou 311121, China.

Publication information

Chin Med Sci J. 2019 Jun 30;34(2):133-139. doi: 10.24920/003589.

Abstract

Objectives: Medical knowledge extraction (MKE) plays a key role in natural language processing (NLP) research on electronic medical records (EMRs), which are important digital carriers for recording patients' medical activities. Named entity recognition (NER) and medical relation extraction (MRE) are two basic tasks of MKE. This study aims to improve recognition accuracy on both tasks by exploring deep learning methods.

Methods: This study built and discussed two application scenarios of the bidirectional long short-term memory model combined with a conditional random field (BiLSTM-CRF), one for NER and one for MRE. In data preprocessing for both tasks, a GloVe word embedding model was used to vectorize words. In the NER task, a sequence labeling strategy was used: the CRF layer classified the tag of each word according to the joint probability distribution. In the MRE task, the relation category between medical entities was predicted by transforming the classification problem of a single entity into a sequence classification problem, with the CRF layer again linking the feature combinations between entities.

Results: Validated on the public I2B2 2010 dataset, the BiLSTM-CRF models built in this study achieved much better results than the baseline methods on both tasks, with an F1-measure of up to 0.88 on NER and 0.78 on MRE. Moreover, the models converged faster and avoided problems such as overfitting.

Conclusion: This study demonstrated the good performance of deep learning on medical knowledge extraction. It also verified the feasibility of the BiLSTM-CRF model in different application scenarios, laying a foundation for subsequent work in the EMR field.
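
The abstract does not show how the CRF layer chooses tags, so the following is only a minimal sketch of the standard decoding step for a linear-chain CRF (Viterbi decoding), not the authors' implementation. The tag set, the toy emission scores (standing in for BiLSTM outputs), the transition scores, and the function name viterbi_decode are all illustrative assumptions.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][j]   : score of tag j at word position t (e.g. from a BiLSTM)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    n_tags = len(emissions[0])
    score = list(emissions[0])      # best score of any path ending in each tag
    backpointers = []               # best previous tag for each position > 0
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            best_prev = max(range(n_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Walk back from the best final tag to recover the full path.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and toy scores for the words "the chest pain".
TAGS = ["O", "B-problem", "I-problem"]
emissions = [
    [2.0, 0.5, 0.1],   # "the"
    [0.2, 1.5, 1.4],   # "chest"
    [0.3, 0.4, 1.6],   # "pain"
]
transitions = [
    [0.5, 0.2, -5.0],  # from O: moving straight to I-problem is penalized
    [0.0, -1.0, 1.0],  # from B-problem
    [0.2, -1.0, 0.5],  # from I-problem
]

best_path = viterbi_decode(emissions, transitions)
print([TAGS[i] for i in best_path])
```

Because the transition scores are part of the path score, the decoder picks the best sequence as a whole instead of the best tag per word, which is the sense in which a CRF layer labels words through a joint distribution.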