School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China.
J Biomed Inform. 2024 Aug;156:104676. doi: 10.1016/j.jbi.2024.104676. Epub 2024 Jun 12.
Biomedical relation extraction has long been considered a challenging task because biomedical texts are specialized and complex. Existing research widely uses syntactic knowledge to enhance relation extraction, where it guides the semantic understanding and text representation of models. However, most studies do not exploit syntactic knowledge fully and often lack fine-grained noise reduction, which leads to confusion in relation classification. In this paper, we propose an attention generator that considers both syntactic dependency type information and syntactic position information to distinguish the importance of different dependency connections. Additionally, we integrate positional information, dependency type information, and word representations to introduce location-enhanced syntactic knowledge that guides our biomedical relation extraction. On three widely used English benchmark datasets in the biomedical domain, our approach consistently outperforms a range of baseline models, demonstrating that it not only makes full use of syntactic knowledge but also effectively reduces the impact of noisy words.
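The idea of weighting dependency connections by type and position can be illustrated with a minimal sketch. This is not the paper's implementation: the function `syntactic_attention`, the dot-product scoring, the additive fusion of word, type, and position vectors, the hand-written dependency edges, and the random embeddings are all assumptions made for illustration only.

```python
import math
import random

DIM = 8        # embedding size (arbitrary, for illustration)
MAX_DIST = 3   # relative positions are clipped to [-MAX_DIST, MAX_DIST]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(*vectors):
    # Element-wise sum of equally sized vectors.
    return [sum(vals) for vals in zip(*vectors)]

def random_table(keys, rng):
    # Stand-in for a learned embedding table (assumption: random values).
    return {key: [rng.uniform(-0.5, 0.5) for _ in range(DIM)] for key in keys}

def clipped_offset(j, entity_idx):
    # Signed distance from token j to an entity token, clipped to the table range.
    return max(-MAX_DIST, min(MAX_DIST, j - entity_idx))

def syntactic_attention(word_vecs, edges, entities, type_emb, pos_emb):
    """Weight each token's dependency neighbours by word, type, and position."""
    n = len(word_vecs)
    # Undirected neighbour lists; every token also attends to itself.
    neighbours = [[(i, "self")] for i in range(n)]
    for head, dep, dep_type in edges:
        neighbours[head].append((dep, dep_type))
        neighbours[dep].append((head, dep_type))

    outputs, weights = [], []
    for i in range(n):
        keys = []
        for j, dep_type in neighbours[i]:
            # Position feature: sum of offset embeddings to each entity.
            pos = add(*(pos_emb[clipped_offset(j, e)] for e in entities))
            # Fuse word, dependency type, and position information.
            keys.append(add(word_vecs[j], type_emb[dep_type], pos))
        scores = [dot(word_vecs[i], k) / math.sqrt(DIM) for k in keys]
        alpha = softmax(scores)  # importance of each dependency connection
        weights.append(alpha)
        outputs.append([sum(a * k[d] for a, k in zip(alpha, keys))
                        for d in range(DIM)])
    return outputs, weights

# Toy sentence with hand-written (head, dependent, type) edges.
tokens = ["Aspirin", "inhibits", "COX-1", "activity"]
edges = [(1, 0, "nsubj"), (1, 3, "obj"), (3, 2, "compound")]
entities = [0, 2]  # indices of the two entity mentions

rng = random.Random(0)
word_vecs = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in tokens]
type_emb = random_table(["self", "nsubj", "obj", "compound"], rng)
pos_emb = random_table(range(-MAX_DIST, MAX_DIST + 1), rng)

outputs, weights = syntactic_attention(word_vecs, edges, entities, type_emb, pos_emb)
for token, alpha in zip(tokens, weights):
    print(token, [round(a, 3) for a in alpha])
```

In a trained model the embedding tables would be learned, so that connections carrying little relational signal receive low weights; here the weights only show the mechanics of the computation.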