Li Yuqing, Shao Xinhui
Department of Mathematics, College of Sciences, Northeastern University, Shenyang, China.
J Biomed Inform. 2024 Dec;160:104756. doi: 10.1016/j.jbi.2024.104756. Epub 2024 Nov 30.
Compared with sentence-level relation extraction, document-level relation extraction is more challenging because a document typically contains multiple entities, and one entity may be related to several others. Existing methods often rely on graph structures to capture path representations between entity pairs. This paper instead introduces local entity pooling, an approach that relies solely on the pre-trained model to identify the bridge entity related to the current entity pair and to generate the reasoning path representation, which mitigates the multi-entity problem. The model also leverages the multi-entity and multi-label characteristics of the document to obtain a thematic representation of the document, which benefits the document-level relation extraction task. Experimental evaluations were conducted on two biomedical datasets, CDR and GDA. Our TCLEP (Thematic Capture and Localized Entity Pooling) model achieved Macro-F1 scores of 71.7% and 85.3%, respectively. In addition, incorporating the local entity pooling and thematic capture modules into the state-of-the-art model improved performance by 1.5% and 0.2% on the respective datasets. These results demonstrate the effectiveness of the proposed approach.
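The abstract does not give the formulation of local entity pooling, so the following is only a minimal sketch of the general idea of weighting candidate bridge entities by their joint relevance to an entity pair. Everything in it is an assumption made for illustration: the function name `local_entity_pool`, the use of dot-product affinities in place of the pre-trained model's attention, and the product-then-normalize weighting are hypothetical and should not be read as the authors' method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def local_entity_pool(entity_vecs, head, tail):
    """Pool the other entities in a document into one path vector.

    entity_vecs: list of entity embeddings (lists of floats).
    head, tail:  indices of the current entity pair.
    Returns (weights, pooled), where weights maps each candidate
    bridge entity index to its normalized weight.

    ASSUMPTION: dot products stand in for the attention scores a
    pre-trained encoder would provide; this is illustrative only.
    """
    candidates = [i for i in range(len(entity_vecs)) if i not in (head, tail)]
    if not candidates:
        raise ValueError("no candidate bridge entities besides the pair")

    # Affinity of each candidate to the head and to the tail entity.
    head_att = softmax([dot(entity_vecs[head], entity_vecs[i]) for i in candidates])
    tail_att = softmax([dot(entity_vecs[tail], entity_vecs[i]) for i in candidates])

    # A bridge entity should matter to BOTH ends of the pair, so the
    # two distributions are multiplied and renormalized.
    joint = [h * t for h, t in zip(head_att, tail_att)]
    total = sum(joint)
    weights = {i: j / total for i, j in zip(candidates, joint)}

    # Weighted sum of candidate embeddings as the path representation.
    dim = len(entity_vecs[0])
    pooled = [
        sum(weights[i] * entity_vecs[i][d] for i in candidates)
        for d in range(dim)
    ]
    return weights, pooled
```

Calling `local_entity_pool(vectors, head=0, tail=1)` returns a weight per remaining entity plus the pooled vector; the entity with the largest weight plays the role of the bridge entity in this toy setting.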