Li Zhijing, Tian Liwei, Jiang Yiping, Huang Yucheng
IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):2156-2166. doi: 10.1109/TCBB.2024.3451348. Epub 2024 Dec 10.
Relation extraction, a crucial task for understanding the intricate relationships between entities in the biomedical domain, has predominantly focused on binary relations within single sentences. In practical biomedical scenarios, however, relationships often span multiple sentences, leading to extraction errors that can affect clinical decision-making and medical diagnosis. To overcome this limitation, we present a novel cross-sentence relation extraction framework that integrates and enhances coreference resolution and relation extraction models. Coreference resolution serves as the foundation, breaking sentence boundaries and linking entity mentions across sentences. Our framework incorporates pre-trained deep language representations and leverages graph LSTMs to model cross-sentence entity mentions effectively. A self-attentive Transformer architecture and external semantic information further enhance the modeling of intricate relationships. Comprehensive experiments on two standard datasets, the BioNLP dataset and the THYME dataset, demonstrate the state-of-the-art performance of our approach.
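The abstract does not give implementation details, but the pipeline it describes (pre-trained representations, coreference-linked entity mentions pooled across sentences, self-attentive relation classification) can be sketched roughly as below. This is a minimal illustration, not the authors' code: the BioBERT checkpoint name, the relation label count, the placeholder mention spans, and the single Transformer layer standing in for the graph LSTM and external semantic features are all assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

ENCODER_NAME = "dmis-lab/biobert-base-cased-v1.1"  # assumed biomedical encoder

class CrossSentenceRE(nn.Module):
    def __init__(self, num_relations=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(ENCODER_NAME)
        hidden = self.encoder.config.hidden_size
        # Stand-in for the paper's graph-LSTM + Transformer components:
        # a single self-attention layer over the pooled entity representations.
        self.attn = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_relations)

    @staticmethod
    def pool_entity(token_states, mention_spans):
        # Average token states over every coreferent mention of one entity,
        # merging information from mentions in different sentences.
        vecs = [token_states[start:end].mean(dim=0) for start, end in mention_spans]
        return torch.stack(vecs).mean(dim=0)

    def forward(self, input_ids, attention_mask, head_spans, tail_spans):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state[0]
        head = self.pool_entity(states, head_spans)
        tail = self.pool_entity(states, tail_spans)
        pair = self.attn(torch.stack([head, tail]).unsqueeze(0)).squeeze(0)
        return self.classifier(torch.cat([pair[0], pair[1]], dim=-1))

tokenizer = AutoTokenizer.from_pretrained(ENCODER_NAME)
model = CrossSentenceRE()
text = ("The patient was started on drug X. Two weeks later, it was "
        "discontinued because of elevated liver enzymes.")
enc = tokenizer(text, return_tensors="pt")
# Placeholder token spans for the coreferent drug mentions ("drug X", "it") and
# the finding ("elevated liver enzymes"); a coreference resolver supplies these in practice.
logits = model(enc["input_ids"], enc["attention_mask"],
               head_spans=[(6, 8), (13, 14)], tail_spans=[(18, 20)])
```

In the framework the abstract describes, the mention spans would come from the coreference resolution component, which is what lets the classifier handle relations whose arguments never co-occur within a single sentence.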