Wang Lei, Wu Fei, Liu Xiaoqing, Cao Jilong, Ma Mingwei, Qu Zhaoyang
School of Computer Science, Northeast Electric Power University, Jilin, 132012, China.
Jilin Institute of Chemical Technology, Jilin, 132022, China.
Sci Rep. 2025 May 6;15(1):15750. doi: 10.1038/s41598-025-00915-5.
Relation extraction plays a crucial role in tasks such as text processing and knowledge graph construction. However, existing extraction algorithms struggle to maintain accuracy in the presence of long-distance dependencies between entities and noise interference. To address these challenges, this paper proposes a novel relation extraction method that integrates semantic and syntactic features to handle noisy long-distance dependencies. Specifically, we combine contextual semantic features generated by the pre-trained BERT model with syntactic features derived from dependency syntax graphs, exploiting the complementary strengths of the two information sources to improve performance in long-distance dependency scenarios. To further improve robustness, we introduce a Self-Attention-based Graph Convolutional Network (SA-GCN) that ranks neighboring nodes within the syntactic graph, filtering out irrelevant nodes and capturing long-distance dependencies more precisely in noisy environments. A residual shrinkage network is also incorporated to dynamically remove noise from the syntactic graph, further strengthening the model's noise resistance. In addition, we propose a loss computation method based on predictive interpolation, which dynamically balances the contributions of semantic and syntactic features through weighted interpolation, thereby improving relation extraction accuracy. Experiments on two public relation extraction datasets demonstrate that the proposed method achieves significant improvements in accuracy, particularly in handling long-distance dependencies and suppressing noise.
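The SA-GCN step described above (scoring a node's syntactic-graph neighbors with self-attention and discarding low-scoring ones before aggregation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy embeddings, the fixed filtering threshold, and the single unparameterized aggregation step are all assumptions introduced here.

```python
import math

def attention_scores(h_center, neighbors):
    """Scaled dot-product attention of a center node over its neighbors."""
    d = len(h_center)
    scores = [sum(a * b for a, b in zip(h_center, h_j)) / math.sqrt(d)
              for h_j in neighbors]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def filter_and_aggregate(h_center, neighbors, threshold=0.2):
    """Drop neighbors with attention weight below `threshold` (a hypothetical
    cutoff), renormalize the remaining weights, and do one weighted
    GCN-style aggregation of the kept neighbor features."""
    weights = attention_scores(h_center, neighbors)
    kept = [(w, h_j) for w, h_j in zip(weights, neighbors) if w >= threshold]
    if not kept:
        return list(h_center)            # no reliable neighbor: keep the node as-is
    z = sum(w for w, _ in kept)
    agg = [0.0] * len(h_center)
    for w, h_j in kept:
        for k, v in enumerate(h_j):
            agg[k] += (w / z) * v
    return agg
```

In this sketch, a syntactically irrelevant (noisy) neighbor receives a low attention weight and is excluded from aggregation entirely, which is the filtering behavior the abstract attributes to the SA-GCN.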
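The predictive-interpolation loss, described as a weighted blend of the semantic and syntactic branches' contributions, might look like the following sketch. The interpolation weight `alpha` is shown here as a plain scalar argument for illustration; the paper balances the two branches dynamically, and the exact weighting scheme is not specified in the abstract.

```python
import math

def interpolated_loss(p_sem, p_syn, gold, alpha):
    """Blend the class distributions predicted by the semantic (BERT) branch
    and the syntactic (SA-GCN) branch with weight `alpha`, then return the
    negative log-likelihood of the gold relation label."""
    assert 0.0 <= alpha <= 1.0
    p = [alpha * a + (1.0 - alpha) * b for a, b in zip(p_sem, p_syn)]
    return -math.log(p[gold])
```

For example, with a confident semantic branch and an uninformative syntactic branch, increasing `alpha` lowers the loss, which is the lever a dynamic weighting scheme could exploit during training.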