Chemical-induced disease relation extraction via attention-based distant supervision.

Affiliations

Natural Language Processing Lab, School of Computer Science and Technology, Soochow University, 1 Shizi Street, Suzhou, China.

Big Data Group, Baidu Inc., Beijing, China.

Publication information

BMC Bioinformatics. 2019 Jul 22;20(1):403. doi: 10.1186/s12859-019-2884-4.

Abstract

BACKGROUND

Automatically understanding chemical-disease relations (CDRs) is crucial in many areas of biomedical research and health care. Supervised machine learning offers a feasible way to extract relations between biomedical entities from the scientific literature automatically. Its success, however, depends heavily on large-scale biomedical corpora whose manual annotation requires intensive labor and substantial investment.

RESULTS

We present an attention-based distant supervision paradigm for the BioCreative-V CDR extraction task. Training examples at both the intra- and inter-sentence levels are generated automatically from the Comparative Toxicogenomics Database (CTD) without any human intervention. An attention-based neural network and a stacked auto-encoder network are applied, respectively, to induce learning models and extract relations at the two levels. The results from both levels are then merged to obtain the document-level CDRs. Without using any annotated corpus, our approach achieves a precision/recall/F1-score of 60.3%/73.8%/66.4%, outperforming state-of-the-art supervised learning systems.
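The labeling and merging steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy entity identifiers, and the input layout are hypothetical, the knowledge base is a small set standing in for CTD, and merging by set union is an assumption, since the abstract does not specify the merge rule. The last helper only checks that the reported F1-score follows from the reported precision and recall.

```python
from itertools import product


def label_candidates(sentences, known_pairs):
    """Label chemical-disease candidate pairs by lookup in a knowledge base.

    sentences:   list of sentences, each a list of (entity_id, entity_type)
                 mentions, with entity_type "Chemical" or "Disease".
    known_pairs: set of (chemical_id, disease_id) pairs; a toy stand-in for
                 CTD (assumption, not the real database format).
    Returns two dicts mapping (chemical_id, disease_id) -> bool label:
    one for pairs co-occurring in a sentence, one for pairs spanning sentences.
    """
    intra, inter = {}, {}
    chemicals = [(i, eid) for i, sent in enumerate(sentences)
                 for eid, etype in sent if etype == "Chemical"]
    diseases = [(i, eid) for i, sent in enumerate(sentences)
                for eid, etype in sent if etype == "Disease"]
    for (chem_sent, chem), (dis_sent, dis) in product(chemicals, diseases):
        # Same sentence index -> intra-sentence candidate, else inter-sentence.
        target = intra if chem_sent == dis_sent else inter
        # Distant supervision: the label comes from the knowledge base,
        # not from a human annotator.
        target[(chem, dis)] = (chem, dis) in known_pairs
    return intra, inter


def merge_levels(intra_positive, inter_positive):
    """Merge predicted relations from both levels (assumed here: set union)."""
    return set(intra_positive) | set(inter_positive)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy document: two sentences with made-up entity identifiers.
    doc = [[("C1", "Chemical"), ("D1", "Disease")], [("D2", "Disease")]]
    intra, inter = label_candidates(doc, known_pairs={("C1", "D1")})
    print(intra, inter)
    print(round(f1_score(0.603, 0.738), 3))  # 0.664, matching the reported 66.4%
```

In a real pipeline the two label dictionaries would become training examples for the intra- and inter-sentence models, and `merge_levels` would receive those models' positive predictions rather than the knowledge-base labels.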

CONCLUSION

Our experiments demonstrate that distant supervision is promising for extracting chemical-disease relations from biomedical literature, and that capturing local and global attention features simultaneously is effective in attention-based distantly supervised learning.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0813/6647285/b1a67aea8f09/12859_2019_2884_Fig1_HTML.jpg
