
Attention-based bidirectional long short-term memory networks for extracting temporal relationships from clinical discharge summaries.

Affiliations

Department of Computer Science, University of Manchester, Manchester, UK; Department of Computer Science, Jamoum University College, Umm Al-Qura University, Makkah, Saudi Arabia.

Centre for Health Informatics, Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, UK; National Institute of Health Research Manchester Biomedical Research Centre, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK; The Alan Turing Institute, UK.

Publication information

J Biomed Inform. 2021 Nov;123:103915. doi: 10.1016/j.jbi.2021.103915. Epub 2021 Sep 29.

Abstract

Temporal relation extraction between health-related events is a widely studied task in clinical Natural Language Processing (NLP). The current state-of-the-art methods mostly rely on engineered features (i.e., rule-based modelling) and sequence modelling, which often encodes a source sentence into a single fixed-length context. An obvious disadvantage of this fixed-length context design is its incapability to model longer sentences, as important temporal information in the clinical text may appear at different positions. To address this issue, we propose an Attention-based Bidirectional Long Short-Term Memory (Att-BiLSTM) model to enable learning the important semantic information in long source text segments and to better determine which parts of the text are most important. We experimented with two embeddings and compared the performances to traditional state-of-the-art methods that require elaborate linguistic pre-processing and hand-engineered features. The experimental results on the i2b2 2012 temporal relation test corpus show that the proposed method achieves a significant improvement with an F-score of 0.811, which is at least 10% better than state-of-the-art in the field. We show that the model can be remarkably effective at classifying temporal relations when provided with word embeddings trained on corpora in a general domain. Finally, we perform an error analysis to gain insight into the common errors made by the model.
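The word-level attention the abstract describes, pooling BiLSTM hidden states into a single sentence representation, can be sketched as below. This is a minimal NumPy illustration of the common Att-BiLSTM formulation (tanh scoring with a learned attention vector), not the authors' implementation; the hidden states here are random stand-ins for real BiLSTM outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, w):
    """Attention pooling over BiLSTM hidden states.

    H: (T, 2d) concatenated forward/backward states for T tokens.
    w: (2d,) learned attention vector (hypothetical parameter name).
    Returns the attended sentence vector (2d,) and weights (T,).
    """
    scores = np.tanh(H) @ w   # score_t = tanh(h_t) . w
    alpha = softmax(scores)   # attention weights, sum to 1
    r = alpha @ H             # weighted sum of hidden states
    return r, alpha

# Toy example: 6 tokens, bidirectional hidden size 2*4 = 8.
T, d2 = 6, 8
H = rng.normal(size=(T, d2))
w = rng.normal(size=d2)
r, alpha = attention_pool(H, w)
```

Because the weights form a distribution over token positions, the pooled vector `r` can emphasize temporally informative words wherever they occur in a long sentence, which is the stated advantage over a single fixed-length encoding.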
