College of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China.
J Biomed Inform. 2018 May;81:83-92. doi: 10.1016/j.jbi.2018.03.011. Epub 2018 Mar 27.
Biomedical relation extraction automatically extracts high-quality biomedical relations from biomedical texts and is a vital step in mining the biomedical knowledge hidden in the literature. Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are the two major neural network models for this task. Neural network-based methods for biomedical relation extraction typically focus on the sentence sequence and employ either RNNs or CNNs, but not both, to learn latent features from it. However, RNNs and CNNs each have their own advantages for biomedical relation extraction, so combining them may improve performance. In this paper, we present a hybrid model for the extraction of biomedical relations that combines RNNs and CNNs. First, the shortest dependency path (SDP) is generated from the dependency graph of the candidate sentence. To make full use of the SDP, we divide it into a dependency word sequence and a relation sequence. Then, RNNs and CNNs are employed to automatically learn features from the sentence sequence and the dependency sequences, respectively. Finally, the output features of the RNNs and CNNs are combined to detect and extract biomedical relations. We evaluate our hybrid model on five public protein-protein interaction (PPI) corpora and one drug-drug interaction (DDI) corpus. The experimental results suggest that the advantages of RNNs and CNNs in biomedical relation extraction are complementary, and that combining RNNs and CNNs can effectively boost biomedical relation extraction performance.
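The SDP step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the SDP is computed, so this sketch assumes a breadth-first search over the dependency graph treated as undirected. The function name `shortest_dependency_path`, the `(head, relation, dependent)` edge format, the entity placeholders `PROT1` and `PROT2`, and the Universal Dependencies-style labels are all hypothetical choices made for illustration. Tokens are assumed to be unique strings; a real pipeline would more likely key nodes by token index.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return (word_sequence, relation_sequence) along the shortest path.

    edges: iterable of (head, relation, dependent) triples (assumed format).
    The graph is treated as undirected for path finding (an assumption).
    Returns two empty lists if no path connects source and target.
    """
    # Build an undirected adjacency list that remembers each edge's label.
    graph = {}
    for head, relation, dependent in edges:
        graph.setdefault(head, []).append((dependent, relation))
        graph.setdefault(dependent, []).append((head, relation))

    # Breadth-first search, recording how each node was first reached.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor, relation in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = (node, relation)
                queue.append(neighbor)

    if target not in parent:
        return [], []

    # Walk back from target to source, splitting words and relations.
    words, relations = [target], []
    node = target
    while parent[node] is not None:
        previous, relation = parent[node]
        words.append(previous)
        relations.append(relation)
        node = previous
    words.reverse()
    relations.reverse()
    return words, relations


# Toy parse of "PROT1 interacts with PROT2" (labels are illustrative only).
example_edges = [
    ("interacts", "nsubj", "PROT1"),
    ("interacts", "nmod", "PROT2"),
    ("PROT2", "case", "with"),
]
word_sequence, relation_sequence = shortest_dependency_path(
    example_edges, "PROT1", "PROT2"
)
print(word_sequence, relation_sequence)
```

In this sketch the word sequence and the relation sequence are returned as two separate lists, mirroring the division of the SDP into a dependency word sequence and a relation sequence; each list could then be embedded and passed to its own network branch.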