BAMRE: Joint extraction model of Chinese medical entities and relations based on Biaffine transformation with relation attention.

Affiliations

Computer Science and Technology, Shandong University of Technology, Zibo, 255000, Shandong, China.

Agricultural Engineering and Food Science, Shandong University of Technology, Zibo, 255000, Shandong, China.

Publication information

J Biomed Inform. 2024 Oct;158:104733. doi: 10.1016/j.jbi.2024.104733. Epub 2024 Oct 3.

Abstract

Electronic Health Records (EHRs) contain many valuable medical entities and the relationships between them. Although biomedical relation extraction has achieved good results in EHR mining and in the construction of biomedical knowledge bases, some problems remain. In overlapping triplets, entities and relationships may have implicit, complex associations, and ignoring these interactions can reduce the accuracy of entity extraction. To address this issue, a joint extraction model for medical entities and relations based on a relation attention mechanism is proposed. The relation extraction module first identifies candidate relationships within a sentence. An attention mechanism conditioned on these relationships then assigns weights to the contextual words in the sentence that are associated with each relationship, and the subject and object entities are extracted. Under a specific relationship, the entity vector representations are used to construct a global entity matching matrix based on Biaffine transformations. This matrix is designed to strengthen the semantic dependencies and relational representations between entities, enabling triplet extraction. As a result, the two subtasks of named entity recognition and relation extraction are interrelated, the contextual information within the sentence is fully used, and the problem of overlapping triplets is addressed effectively. Experiments on the CMeIE Chinese medical relation extraction dataset and the Baidu2019 Chinese dataset show that the approach achieves a higher F1 score than all state-of-the-art baselines compared. It also offers substantial performance improvements in complex scenarios involving diverse overlapping patterns, sentences with many triplets, and cross-sentence triplets.
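The abstract does not give the model's equations, so the following is only a minimal sketch of the two generic building blocks it names: attention weights over tokens conditioned on a relation embedding, and a Biaffine score for every subject/object pair collected into a matching matrix. The function names, the dot-product attention form, and the toy parameters `U`, `w`, and `b` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def relation_attention(tokens, rel):
    """Assumed form: softmax over dot products of token vectors with a relation embedding."""
    return softmax([dot(t, rel) for t in tokens])

def biaffine_score(s, o, U, w, b):
    """Standard biaffine form: s^T U o + w^T [s; o] + b."""
    bilinear = sum(s[i] * U[i][j] * o[j]
                   for i in range(len(s)) for j in range(len(o)))
    linear = dot(w, s + o)  # s + o concatenates the two lists
    return bilinear + linear + b

def matching_matrix(subjects, objects, U, w, b):
    """Score every (subject, object) pair under one relation."""
    return [[biaffine_score(s, o, U, w, b) for o in objects] for s in subjects]

# Toy inputs (hypothetical): three 2-dimensional token vectors, one relation embedding.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rel = [1.0, 0.0]
U = [[1.0, 0.0], [0.0, 1.0]]   # identity, so the bilinear term is a plain dot product
w = [0.0, 0.0, 0.0, 0.0]
b = 0.0

weights = relation_attention(tokens, rel)
scores = matching_matrix(tokens, tokens, U, w, b)
```

In a trained model, `U`, `w`, and `b` would be learned per relation, and cells of the matrix above a threshold would be read off as subject/object pairs for that relation; that decoding step is likewise an assumption here.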
