School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China.
J Biomed Inform. 2022 Jan;125:103956. doi: 10.1016/j.jbi.2021.103956. Epub 2021 Nov 27.
Extracting entities and their relations from unstructured literature as structured triplets is essential for biomedical knowledge extraction. Sentences in biomedical datasets often contain many overlapping triplets, in which entities are shared across triplets, and previous methods struggle to extract them effectively. In this work, we propose a novel tagging strategy that performs joint extraction within the machine reading comprehension framework. First, our method uses the Query of the machine reading comprehension framework to introduce information about a specific relation. Second, it introduces a tagging strategy designed for overlapping triplets in the biomedical domain. We evaluate our method on the CHEMPROT and DDIExtraction2013 datasets. The experimental results show that the proposed method improves the model's ability to handle overlapping triplets and thereby improves extraction performance.
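The idea of pairing a relation-specific Query with a sentence can be illustrated with a minimal sketch. It is not the tagging strategy proposed in the paper, whose details the abstract does not give. The query wording, the relation names, the `HEAD`/`TAIL` BIO tags, the function `build_mrc_inputs`, and the placeholder entities are all assumptions made for illustration. The sketch only shows why issuing one query per relation keeps overlapping triplets apart: an entity shared by two triplets is tagged once per query, in separate tag sequences.

```python
from collections import defaultdict

# Hypothetical query templates, one per relation type (illustrative wording only).
QUERY_TEMPLATES = {
    "inhibitor": "Which chemical inhibits which protein?",
    "substrate": "Which chemical is a substrate of which protein?",
}


def build_mrc_inputs(tokens, triplets):
    """Build one (query, tokens, tags) example per relation type.

    triplets: list of (head_span, relation, tail_span), where each span is
    a (start, end) pair of token indices with end exclusive.
    """
    # Group the gold entity pairs by relation so each query sees only its own pairs.
    by_relation = defaultdict(list)
    for head, relation, tail in triplets:
        by_relation[relation].append((head, tail))

    examples = []
    for relation, query in QUERY_TEMPLATES.items():
        tags = ["O"] * len(tokens)
        for head, tail in by_relation.get(relation, []):
            for role, (start, end) in (("HEAD", head), ("TAIL", tail)):
                tags[start] = f"B-{role}"
                for i in range(start + 1, end):
                    tags[i] = f"I-{role}"
        examples.append(
            {"relation": relation, "query": query, "tokens": tokens, "tags": tags}
        )
    return examples


# Placeholder sentence: "DrugA" takes part in two triplets (an overlapping case).
tokens = ["DrugA", "inhibits", "ProteinX", "and", "is", "metabolized", "by", "ProteinY"]
triplets = [
    ((0, 1), "inhibitor", (2, 3)),
    ((0, 1), "substrate", (7, 8)),
]

for example in build_mrc_inputs(tokens, triplets):
    print(example["relation"], example["tags"])
```

In this simplified scheme, several pairs that share the same relation within one sentence would still be merged into a single tag sequence, so the head-tail pairing would be lost. The sketch does not attempt to resolve that case.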