Relationship extraction between entities with long distance dependencies and noise based on semantic and syntactic features.

Author information

Wang Lei, Wu Fei, Liu Xiaoqing, Cao Jilong, Ma Mingwei, Qu Zhaoyang

Affiliations

School of Computer Science, Northeast Electric Power University, Jilin, 132012, China.

Jilin Institute of Chemical Technology, Jilin, 132022, China.

Publication information

Sci Rep. 2025 May 6;15(1):15750. doi: 10.1038/s41598-025-00915-5.

Abstract

Relation extraction plays a crucial role in tasks such as text processing and knowledge graph construction. However, existing extraction algorithms struggle to maintain accuracy when dealing with long-distance dependencies between entities and noise interference. To address these challenges, this paper proposes a novel relation extraction method that integrates semantic and syntactic features for handling noisy long-distance dependencies. Specifically, we leverage contextual semantic features generated by the pre-trained BERT model alongside syntactic features derived from dependency syntax graphs, effectively utilizing the complementary strengths of both sources of information to enhance the model's performance in long-distance dependency scenarios. To further improve robustness, we introduce a Self-Attention-based Graph Convolutional Network (SA-GCN) to rank neighboring nodes within the syntactic graph, filtering out irrelevant nodes and capturing long-distance dependencies more precisely in noisy environments. Additionally, a residual shrinking network is incorporated to dynamically remove noise from the syntactic graph, further strengthening the model's noise resistance. Moreover, we propose a loss computation method based on predictive interpolation, which dynamically balances the contributions of semantic and syntactic features through weighted interpolation, thereby enhancing relation extraction accuracy. Experiments conducted on two public relation extraction datasets demonstrate that the proposed method achieves significant improvements in accuracy, particularly in handling long-distance dependencies and noise suppression.
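The abstract names two operations without giving their formulas: noise removal by a residual shrinking network and a loss based on predictive interpolation. The sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes that the shrinking step uses soft thresholding (the operation typically used in residual shrinkage networks) and that "predictive interpolation" means mixing the class distributions predicted from the semantic and syntactic branches with a weight before taking the cross-entropy. The function names, the fixed threshold `tau`, and the fixed weight `lam` are hypothetical; the paper describes both as being set dynamically.

```python
import math


def soft_threshold(values, tau):
    """Shrink each value toward zero by tau; anything with |v| <= tau becomes 0.

    Assumption: tau is a fixed number here. In a residual shrinkage network
    the threshold is normally produced by a small learned sub-network.
    """
    out = []
    for v in values:
        magnitude = max(abs(v) - tau, 0.0)
        out.append(magnitude if v >= 0 else -magnitude)
    return out


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolated_loss(sem_logits, syn_logits, label, lam):
    """Cross-entropy of a weighted mix of two predicted distributions.

    Assumption: lam in [0, 1] weights the semantic branch and (1 - lam)
    the syntactic branch. It is a plain argument here, not a learned value.
    """
    p_sem = softmax(sem_logits)
    p_syn = softmax(syn_logits)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(p_sem, p_syn)]
    return -math.log(mixed[label])


if __name__ == "__main__":
    # Small activations are zeroed, larger ones are shrunk by tau.
    print(soft_threshold([1.0, -0.2, -2.0], tau=0.5))
    # The semantic branch favours class 0, the syntactic branch class 1.
    print(interpolated_loss([2.0, 0.0], [0.0, 2.0], label=0, lam=0.7))
```

With `lam=1.0` the loss reduces to the ordinary cross-entropy of the semantic branch alone, and with `lam=0.0` to that of the syntactic branch, which is one simple way to see how an interpolation weight trades off the two feature sources.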
