Sun Xia, Dong Ke, Ma Long, Sutcliffe Richard, He Feijuan, Chen Sushing, Feng Jun
Department of Information Science and Technology, Northwest University, Xi'an 710127, China.
Department of Computer Science, Xi'an Jiaotong University City College, Xi'an 710069, China.
Entropy (Basel). 2019 Jan 8;21(1):37. doi: 10.3390/e21010037.
Drug-drug interactions (DDIs) can pose serious health risks when a patient takes two or more drugs at the same time or within a certain period. The automatic extraction of unknown DDIs therefore has great potential for pharmaceutical development and the safety of drug use. In this article, we propose a novel recurrent hybrid convolutional neural network (RHCNN) for DDI extraction from biomedical literature. In the embedding layer, a text mentioning two entities is represented as a sequence of semantic embeddings and position embeddings. In particular, the complete semantic embedding is obtained by fusing each word embedding with its contextual information, which is learnt by a recurrent structure. A hybrid convolutional neural network is then employed to learn sentence-level features for DDI extraction, which consist of local context features from consecutive words and dependency features between separated words. Finally, and most importantly, to compensate for the shortcomings of the traditional cross-entropy loss function on class-imbalanced data, we apply an improved focal loss function when using the DDIExtraction 2013 dataset. In our experiments, we achieve a micro F-score of 75.48% on the DDIExtraction 2013 dataset, outperforming the state-of-the-art approach by 2.49%.
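To illustrate why a focal loss helps with class-imbalanced data, the following is a minimal sketch of the standard multi-class focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is not the improved variant used in the article, whose exact form is not given in this abstract. The function names, the default `gamma=2.0`, and the optional per-class `alpha` weights are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard focal loss for one example (illustrative, not the paper's variant).

    logits: list of raw class scores
    target: index of the true class
    gamma:  focusing parameter; gamma = 0 recovers plain cross-entropy
    alpha:  optional list of per-class weights (assumed, e.g. larger for rare classes)
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t)^gamma shrinks the loss of examples the model already gets right.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, expressed as focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


if __name__ == "__main__":
    scores = [5.0, 0.0]  # the model strongly favours class 0
    for name, target in (("easy (true class 0)", 0), ("hard (true class 1)", 1)):
        ce = cross_entropy(scores, target)
        fl = focal_loss(scores, target)
        print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.6f}")
```

With cross-entropy, the many easy examples of a dominant class (such as sentence pairs with no interaction) can outweigh the few hard examples of rare classes in the total loss. The modulating factor (1 - p_t)^gamma reduces the contribution of confidently classified examples to near zero while leaving hard examples almost unchanged.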