Leveraging Multi-source knowledge for Chinese clinical named entity recognition via relational graph convolutional network.

Author affiliations

Department of Computer Science, Harbin Institute of Technology, Shenzhen, China; Peng Cheng Laboratory, Shenzhen, China.

Department of Computer Science, Harbin Institute of Technology, Shenzhen, China.

Publication information

J Biomed Inform. 2022 Apr;128:104035. doi: 10.1016/j.jbi.2022.104035. Epub 2022 Feb 23.

DOI:10.1016/j.jbi.2022.104035
PMID:35217186
Abstract

OBJECTIVE

External knowledge, such as a lexicon of Chinese words and a domain knowledge graph (KG) of concepts, has recently been adopted to improve machine learning methods for named entity recognition (NER), as it provides information beyond the context. However, most existing studies draw on only one knowledge source (either a lexicon or a knowledge graph), each in its own way, and treat lexicon words or KG concepts independently of their boundaries. In this paper, we focus on leveraging multi-source knowledge in a unified manner, in which lexicon words and KG concepts are combined with their boundaries, for Chinese clinical NER (CNER).

MATERIAL AND METHODS

We propose a novel method based on a relational graph convolutional network (RGCN), called MKRGCN, to utilize multi-source knowledge in a unified manner for CNER. For each sentence, a relational graph is constructed for each knowledge source, in which the lexicon words or KG concepts appearing in the sentence are linked to the tokens they contain, together with the boundary information of those words or concepts. RGCN is used to model all relational graphs constructed from the multi-source knowledge, and the resulting knowledge-based token representations are integrated into the contextual token representations via an attention mechanism. On top of the knowledge-enhanced token representations, we deploy a conditional random field (CRF) layer for named entity label prediction. In this study, a lexicon of words and a medical knowledge graph are used as the knowledge sources for CNER.
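The graph construction step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `build_relational_graph`, the B/M/E/S relation labels, the `max_len` cutoff, and the toy lexicon are all assumptions, since the abstract only states that matched words are linked to their tokens with boundary information.

```python
from typing import List, Set, Tuple

# Hypothetical relation labels: where a token sits inside a matched word.
BEGIN, MIDDLE, END, SINGLE = "B", "M", "E", "S"

Match = Tuple[int, int, str]  # (start, end_exclusive, word)
Edge = Tuple[int, str, int]   # (word_node_id, relation, token_index)


def build_relational_graph(
    tokens: List[str], lexicon: Set[str], max_len: int = 4
) -> Tuple[List[Match], List[Edge]]:
    """Match lexicon words in a token sequence and link each to its tokens.

    Token nodes use ids 0..len(tokens)-1; the i-th matched word gets
    node id len(tokens) + i. Each edge is typed by the token's position
    inside the word, which is one way to encode boundary information.
    """
    matches: List[Match] = []
    edges: List[Edge] = []
    n = len(tokens)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = "".join(tokens[start:end])
            if word not in lexicon:
                continue
            node_id = n + len(matches)
            matches.append((start, end, word))
            for pos in range(start, end):
                if end - start == 1:
                    rel = SINGLE
                elif pos == start:
                    rel = BEGIN
                elif pos == end - 1:
                    rel = END
                else:
                    rel = MIDDLE
                edges.append((node_id, rel, pos))
    return matches, edges


# Toy example: character-level tokens and a made-up four-word lexicon.
tokens = list("右肺上叶癌")
lexicon = {"右肺", "肺", "上叶", "右肺上叶"}
matches, edges = build_relational_graph(tokens, lexicon)
for node_id, rel, pos in edges:
    print(node_id, rel, tokens[pos])
```

Under these assumptions, one such graph would be built per knowledge source (for example, one from the lexicon and one from KG concept names), and a relational GCN would then use a separate weight matrix per relation label when passing messages from word nodes to token nodes.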

RESULTS

Our proposed method achieves the best performance on the Chinese datasets CCKS2017 and CCKS2018, with F1-scores of 91.88% and 89.91%, respectively, significantly outperforming existing methods. Extended experiments on the English datasets NCBI-Disease and BC2GM also support the effectiveness of our method when only one knowledge source is considered via RGCN.
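For readers unfamiliar with how F1-scores for NER are typically computed, the sketch below shows a strict entity-level F1 over BIO tag sequences. It is an assumption that the reported scores use this kind of exact-match criterion; the abstract does not describe the evaluation protocol, and the helper names and the `DIS`/`SYM` labels are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from BIO tags; stray I- tags are ignored."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        inside = tag.startswith("I-") and label == tag[2:]
        if inside:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def f1_score(gold: List[str], pred: List[str]) -> float:
    """Strict F1: a predicted entity counts only if span and type both match."""
    g, p = extract_spans(gold), extract_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = ["B-DIS", "I-DIS", "O", "B-SYM", "O"]
pred = ["B-DIS", "I-DIS", "O", "B-SYM", "I-SYM"]
print(f1_score(gold, pred))  # one of two entities matches exactly -> 0.5
```

In the toy example the second predicted entity has the right type but the wrong right boundary, so it counts as both a false positive and a false negative under strict matching.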

CONCLUSION

The MKRGCN model can effectively integrate knowledge from an external lexicon and a knowledge graph for CNER and has the potential to be applied to English NER.

Similar articles

1
Leveraging Multi-source knowledge for Chinese clinical named entity recognition via relational graph convolutional network.
J Biomed Inform. 2022 Apr;128:104035. doi: 10.1016/j.jbi.2022.104035. Epub 2022 Feb 23.
2
Chinese Clinical Named Entity Recognition From Electronic Medical Records Based on Multisemantic Features by Using Robustly Optimized Bidirectional Encoder Representation From Transformers Pretraining Approach Whole Word Masking and Convolutional Neural Networks: Model Development and Validation.
JMIR Med Inform. 2023 May 10;11:e44597. doi: 10.2196/44597.
3
Chinese Clinical Named Entity Recognition in Electronic Medical Records: Development of a Lattice Long Short-Term Memory Model With Contextualized Character Representations.
JMIR Med Inform. 2020 Sep 4;8(9):e19848. doi: 10.2196/19848.
4
Chinese clinical named entity recognition via multi-head self-attention based BiLSTM-CRF.
Artif Intell Med. 2022 May;127:102282. doi: 10.1016/j.artmed.2022.102282. Epub 2022 Mar 18.
5
An attention-based deep learning model for clinical named entity recognition of Chinese electronic medical records.
BMC Med Inform Decis Mak. 2019 Dec 5;19(Suppl 5):235. doi: 10.1186/s12911-019-0933-6.
6
Multiple Embeddings Enhanced Multi-Graph Neural Networks for Chinese Healthcare Named Entity Recognition.
IEEE J Biomed Health Inform. 2021 Jul;25(7):2801-2810. doi: 10.1109/JBHI.2020.3048700. Epub 2021 Jul 27.
7
Combinatorial feature embedding based on CNN and LSTM for biomedical named entity recognition.
J Biomed Inform. 2020 Mar;103:103381. doi: 10.1016/j.jbi.2020.103381. Epub 2020 Jan 28.
8
Research on Chinese medical named entity recognition based on collaborative cooperation of multiple neural network models.
J Biomed Inform. 2020 Apr;104:103395. doi: 10.1016/j.jbi.2020.103395. Epub 2020 Feb 25.
9
Semantic-enhanced graph neural network for named entity recognition in ancient Chinese books.
Sci Rep. 2024 Jul 30;14(1):17488. doi: 10.1038/s41598-024-68561-x.
10
BioByGANS: biomedical named entity recognition by fusing contextual and syntactic features through graph attention network in node classification framework.
BMC Bioinformatics. 2022 Nov 22;23(1):501. doi: 10.1186/s12859-022-05051-9.

Cited by

1
Development and application of Chinese medical ontology for diabetes mellitus.
BMC Med Inform Decis Mak. 2024 Jan 19;24(1):18. doi: 10.1186/s12911-023-02405-y.
2
Dictionary-based matching graph network for biomedical named entity recognition.
Sci Rep. 2023 Dec 8;13(1):21667. doi: 10.1038/s41598-023-48564-w.
3
Named Entity Recognition in Electronic Health Records: A Methodological Review.
Healthc Inform Res. 2023 Oct;29(4):286-300. doi: 10.4258/hir.2023.29.4.286. Epub 2023 Oct 31.
4
CLART: A cascaded lattice-and-radical transformer network for Chinese medical named entity recognition.
Heliyon. 2023 Oct 10;9(10):e20692. doi: 10.1016/j.heliyon.2023.e20692. eCollection 2023 Oct.
5
Exploring the effects of drug, disease, and protein dependencies on biomedical named entity recognition: A comparative analysis.
Front Pharmacol. 2022 Dec 21;13:1020759. doi: 10.3389/fphar.2022.1020759. eCollection 2022.
6
A Complex Heterogeneous Network Model of Disease Regulated by Noncoding RNAs: A Case Study of Unstable Angina Pectoris.
Comput Intell Neurosci. 2022 Dec 23;2022:5852089. doi: 10.1155/2022/5852089. eCollection 2022.
7
A Multigranularity Text Driven Named Entity Recognition CGAN Model for Traditional Chinese Medicine Literatures.
Comput Intell Neurosci. 2022 Sep 24;2022:1495841. doi: 10.1155/2022/1495841. eCollection 2022.