Zheng Wei, Lin Hongfei, Luo Ling, Zhao Zhehuan, Li Zhengguang, Zhang Yijia, Yang Zhihao, Wang Jian
College of Computer Science and Technology, Dalian University of Technology, Dalian, China.
College of Software, Dalian JiaoTong University, Dalian, China.
BMC Bioinformatics. 2017 Oct 10;18(1):445. doi: 10.1186/s12859-017-1855-x.
Drug-drug interactions (DDIs) often cause unexpected side effects. The clinical recognition of DDIs is a crucial issue for both patient safety and healthcare cost control. Although text-mining-based systems have explored various methods to classify DDIs, their classification performance on long and complex sentences is still unsatisfactory.
In this study, we propose an effective model that classifies DDIs from the literature by combining an attention mechanism and a recurrent neural network with long short-term memory (LSTM) units. In our approach, first, a candidate-drug-oriented input attention acting on word-embedding vectors automatically learns which words are more influential for a given drug pair. Next, the inputs merging the position- and POS-embedding vectors are passed to a bidirectional LSTM layer whose outputs at the last time step represent the high-level semantic information of the whole sentence. Finally, a softmax layer performs DDI classification.
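The first two steps described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how attention scores are computed, so the dot-product scoring, the averaging of the two per-drug weight vectors, and the helper names (`input_attention`, `add_side_features`) are all assumptions made here. The bidirectional LSTM and softmax layers are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def input_attention(word_vecs, drug1_idx, drug2_idx):
    """Reweight word embeddings by their relevance to the candidate drug pair.

    ASSUMPTION: relevance is the dot product between a word embedding and a
    candidate drug's embedding, and the two per-drug attention distributions
    are averaged. The abstract does not specify either choice.
    """
    weights_per_drug = []
    for idx in (drug1_idx, drug2_idx):
        drug_vec = word_vecs[idx]
        scores = [dot(w, drug_vec) for w in word_vecs]
        weights_per_drug.append(softmax(scores))
    # Average the two distributions so the weights still sum to 1.
    weights = [(a + b) / 2 for a, b in zip(*weights_per_drug)]
    # Scale each word embedding by its attention weight.
    attended = [[alpha * x for x in w] for alpha, w in zip(weights, word_vecs)]
    return weights, attended


def add_side_features(attended, pos_vecs, dist1_vecs, dist2_vecs):
    """Concatenate POS- and position-embedding vectors onto each word vector.

    ASSUMPTION: one position embedding per candidate drug (relative distance).
    """
    return [
        w + p + d1 + d2
        for w, p, d1, d2 in zip(attended, pos_vecs, dist1_vecs, dist2_vecs)
    ]


if __name__ == "__main__":
    # Toy 4-token sentence with 2-dimensional embeddings (made-up values);
    # tokens 0 and 3 play the role of the candidate drug pair.
    words = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]
    weights, attended = input_attention(words, 0, 3)
    features = add_side_features(
        attended, [[0.1]] * 4, [[0.2]] * 4, [[0.3]] * 4
    )
    print(weights)
    print(features)
```

In a full model, the per-token feature vectors returned by `add_side_features` would be the input sequence to the bidirectional LSTM layer.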
Experimental results on the DDIExtraction 2013 corpus show that our system outperforms other state-of-the-art methods in both detection and classification (84.0% and 77.3%, respectively). In particular, on the Medline-2013 dataset, which contains long and complex sentences, our F-score exceeds those of the top-ranking systems by 12.6%.
Our approach effectively improves the performance of DDI classification tasks. Experimental analysis demonstrates that our model is better at recognizing not only close-range but also long-range patterns among words, especially in long, complex and compound sentences.