
Causal relationship extraction from biomedical text using deep neural models: A comprehensive survey.

Affiliations

Language Intelligence and Information Retrieval Lab, KU Leuven, Belgium; Department of Computer Science, Celestijnenlaan 200 A, Leuven, Belgium.

Publication information

J Biomed Inform. 2021 Jul;119:103820. doi: 10.1016/j.jbi.2021.103820. Epub 2021 May 24.

DOI: 10.1016/j.jbi.2021.103820
PMID: 34044157

Abstract

The identification of causal relationships between events or entities within biomedical texts is of great importance for creating scientific knowledge bases and is also a fundamental natural language processing (NLP) task. A causal (cause-effect) relation is defined as an association between two events in which the first must occur before the second. Although this task is an open problem in artificial intelligence, and despite its important role in information extraction from the biomedical literature, very few works have considered this problem. However, with the advent of new techniques in machine learning, especially deep neural networks, research increasingly addresses this problem. This paper summarizes state-of-the-art research, its applications, existing datasets, and remaining challenges. For this survey we have implemented and evaluated various techniques including a Multiview CNN (MVC), attention-based BiLSTM models and state-of-the-art word embedding models, such as those obtained with bidirectional encoder representations (ELMo) and transformer architectures (BioBERT). In addition, we have evaluated a graph LSTM as well as a baseline rule based system. We have investigated the class imbalance problem as an innate property of annotated data in this type of task. The results show that a considerable improvement of the results of state-of-the-art systems can be achieved when a simple random oversampling technique for data augmentation is used in order to reduce class imbalance.
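The abstract's closing finding concerns simple random oversampling as a way to reduce class imbalance. The sketch below is a minimal illustration of that general technique, not the authors' implementation: the function name `random_oversample`, the toy sentences, and the label names `causal` and `none` are all assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen examples of the smaller classes until
    every class has as many examples as the largest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap for this class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy, made-up data: one causal sentence among five non-causal ones.
sentences = ["s1", "s2", "s3", "s4", "s5", "s6"]
labels = ["none", "none", "none", "none", "causal", "none"]

balanced_x, balanced_y = random_oversample(sentences, labels)
print(Counter(balanced_y))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated examples do not leak into the evaluation data.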
