IEEE/ACM Trans Comput Biol Bioinform. 2019 Nov-Dec;16(6):1879-1889. doi: 10.1109/TCBB.2018.2838661. Epub 2018 May 21.
Automatically extracting the relationships between chemicals and diseases is important to many areas of biomedical research and health care. Biomedical experts have built many large-scale knowledge bases (KBs) to advance biomedical research. KBs contain large amounts of structured information about entities and relationships, and therefore play a pivotal role in chemical-disease relation (CDR) extraction. However, previous studies have paid little attention to the prior knowledge in KBs. This paper proposes a neural network-based attention model (NAM) for CDR extraction, which makes full use of context information in documents and prior knowledge in KBs. For a pair of entities in a document, an attention mechanism is employed to select important context words with respect to the relation representations learned from KBs. Experiments on the BioCreative V CDR dataset show that combining context and knowledge representations through the attention mechanism significantly improves CDR extraction performance while achieving results comparable to state-of-the-art systems.
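To illustrate the idea of selecting context words with respect to a KB-derived relation representation, here is a minimal sketch in plain Python. It is not the paper's NAM: the abstract does not specify the scoring function, so dot-product scoring with a softmax is an assumption, and the function name `attend`, the two-dimensional vectors, and the toy values are all hypothetical.

```python
import math


def attend(context_vectors, relation_vector):
    """Weight context word vectors by similarity to a relation vector.

    Assumption: dot-product scoring followed by softmax. The abstract
    does not state which scoring function the model uses.
    """
    # One score per context word: dot product with the relation vector.
    scores = [
        sum(c * r for c, r in zip(vec, relation_vector))
        for vec in context_vectors
    ]
    # Softmax, shifted by the maximum score for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the context vectors.
    dim = len(relation_vector)
    summary = [
        sum(w * vec[i] for w, vec in zip(weights, context_vectors))
        for i in range(dim)
    ]
    return weights, summary


# Hypothetical toy embeddings for three context words and one relation.
context = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relation = [2.0, 0.0]
weights, summary = attend(context, relation)
```

In this toy input the first context word is most aligned with the relation vector, so it receives the largest weight. In a real system the context vectors would come from a document encoder and the relation vector from embeddings learned on the KB.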