Chen Yang, Shi Bowen
State Key Lab of Software Development Environment, Beihang University, Beijing 100191, China.
School of Journalism, Communication University of China, Beijing 100024, China.
Entropy (Basel). 2024 Feb 28;26(3):210. doi: 10.3390/e26030210.
Recent years have seen a rise in interest in document-level relation extraction, which is defined as extracting all relations between entities in multiple sentences of a document. Typically, there are multiple mentions corresponding to a single entity in this context. Previous research predominantly employed a holistic representation for each entity to predict relations, but this approach often overlooks valuable information contained in fine-grained entity mentions. We contend that relation prediction and inference should be grounded in specific entity mentions rather than abstract entity concepts. To address this, our paper proposes a two-stage mention-level framework based on an enhanced heterogeneous graph attention network for document-level relation extraction. Our framework employs two different strategies to model intra-sentential and inter-sentential relations between fine-grained entity mentions, yielding local mention representations for intra-sentential relation prediction and global mention representations for inter-sentential relation prediction. For inter-sentential relation prediction and inference, we propose an enhanced heterogeneous graph attention network to better model the long-distance semantic relationships and design an entity-coreference path-based inference strategy to conduct relation inference. Moreover, we introduce a novel cross-entropy-based multilabel focal loss function to address the class imbalance problem and multilabel prediction simultaneously. Comprehensive experiments have been conducted to verify the effectiveness of our framework. Experimental results show that our approach significantly outperforms the existing methods.
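The abstract names a cross-entropy-based multilabel focal loss but does not state its formula. As a rough illustration of the general idea only, the sketch below applies the standard sigmoid focal loss independently to each relation label, so that well-classified labels (typically the abundant negative ones) contribute less to the total. This is an assumed stand-in, not the authors' loss: the function name `multilabel_focal_loss`, the parameters `gamma` and `alpha`, and their default values are all hypothetical.

```python
import numpy as np


def multilabel_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Per-label sigmoid focal loss, summed over labels and averaged over samples.

    ASSUMPTION: this is the generic binary focal loss applied to each label
    independently; the paper's exact formulation may differ.

    logits  : array of shape (n_samples, n_labels), raw scores
    targets : array of the same shape, 1 if the relation holds, else 0
    gamma   : focusing parameter; 0 disables the down-weighting of easy labels
    alpha   : weight on positive labels (negatives get 1 - alpha)
    """
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets, dtype=float)

    # Numerically stable binary cross-entropy on logits:
    # max(x, 0) - x * t + log(1 + exp(-|x|))
    bce = (np.maximum(logits, 0.0)
           - logits * targets
           + np.log1p(np.exp(-np.abs(logits))))

    # bce equals -log(p_t), where p_t is the probability given to the true class.
    p_t = np.exp(-bce)

    # Class-balancing weight: alpha for positives, 1 - alpha for negatives.
    alpha_t = targets * alpha + (1.0 - targets) * (1.0 - alpha)

    # The modulating factor (1 - p_t)^gamma shrinks the loss of easy labels.
    per_label = alpha_t * (1.0 - p_t) ** gamma * bce
    return per_label.sum(axis=-1).mean()


if __name__ == "__main__":
    # Two entity pairs, three candidate relation labels each (toy values).
    scores = np.array([[2.0, -1.0, 0.5], [-3.0, 0.2, 1.5]])
    labels = np.array([[1, 0, 1], [0, 0, 1]])
    print(multilabel_focal_loss(scores, labels))
```

Because each label gets its own sigmoid, an entity pair can carry several relations at once, which is the multilabel part. Setting `gamma=0` reduces the sketch to an alpha-weighted binary cross-entropy.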