Automatically disambiguating medical acronyms with ontology-aware deep learning.

Affiliations

Department of Computer Science, University of Toronto, Toronto, Canada.

DATA Team & Techna Institute, University Health Network, Toronto, Canada.

Publication information

Nat Commun. 2021 Sep 7;12(1):5319. doi: 10.1038/s41467-021-25578-4.

DOI: 10.1038/s41467-021-25578-4

PMID: 34493718

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8423722/

Abstract

Modern machine learning (ML) technologies hold great promise for automating diverse clinical and research workflows; however, training them requires extensive hand-labelled datasets. Disambiguating abbreviations is important for automated clinical note processing; however, broad deployment of ML for this task is restricted by the scarcity and imbalance of labelled training data. In this work we present a method that improves a model's ability to generalize through novel data augmentation techniques that utilize information from biomedical ontologies, in the form of related medical concepts, as well as global context information within the medical note. We train our model on a public dataset (MIMIC III) and test its performance on automatically generated and hand-labelled datasets from different sources (MIMIC III, CASI, i2b2). Together, these techniques boost the accuracy of abbreviation disambiguation by up to 17% on hand-labelled data, without sacrificing performance on a held-out test set from MIMIC III.
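The abstract does not spell out how the augmentation works, so the following is only a minimal sketch of one way ontology-based augmentation for rare expansions might look; it is not the authors' implementation. The toy ontology `RELATED`, the example contexts in `CONTEXTS`, and the function `augment` are all hypothetical names and data introduced for illustration. The idea shown is that an expansion with few labelled examples borrows context sentences from related concepts until it reaches a target count.

```python
import random

# Hypothetical toy ontology: each expansion maps to its related concepts.
RELATED = {
    "pulmonary embolism": ["deep vein thrombosis", "pulmonary infarction"],
    "physical exam": [],
}

# Hypothetical labelled context sentences per concept (illustrative only).
CONTEXTS = {
    "pulmonary embolism": [
        "ct showed filling defect consistent with pulmonary embolism",
    ],
    "deep vein thrombosis": [
        "lower extremity doppler positive for deep vein thrombosis",
        "started heparin for deep vein thrombosis",
    ],
    "pulmonary infarction": [
        "wedge-shaped opacity suggests pulmonary infarction",
    ],
    "physical exam": [
        "physical exam unremarkable",
        "on physical exam lungs clear",
        "physical exam notable for edema",
    ],
}


def augment(expansion, target, related=RELATED, contexts=CONTEXTS, seed=0):
    """Pad the samples for `expansion` toward `target` examples.

    Returns (text, label, source) tuples. Original samples are always kept;
    any shortfall is filled with contexts borrowed from related concepts
    and relabelled with `expansion`.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    own = [(text, expansion, "original") for text in contexts.get(expansion, [])]
    pool = [
        text
        for concept in related.get(expansion, [])
        for text in contexts.get(concept, [])
    ]
    rng.shuffle(pool)
    needed = max(0, target - len(own))
    borrowed = [(text, expansion, "borrowed") for text in pool[:needed]]
    return own + borrowed


if __name__ == "__main__":
    for text, label, source in augment("pulmonary embolism", target=3):
        print(f"{source:9s} {label}: {text}")
```

In this sketch a well-represented expansion such as "physical exam" is returned unchanged, while the rare one is topped up from its ontology neighbours. A fuller system would presumably weight neighbours by their distance in the ontology, which this toy version does not attempt.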

Fig. 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f72/8423722/06afc8bb085b/41467_2021_25578_Fig1_HTML.jpg

Similar articles

deepBioWSD: effective deep neural word sense disambiguation of biomedical text data.

J Am Med Inform Assoc. 2019 May 1;26(5):438-446. doi: 10.1093/jamia/ocy189.

Improving clinical abbreviation sense disambiguation using attention-based Bi-LSTM and hybrid balancing techniques in imbalanced datasets.

J Eval Clin Pract. 2024 Oct;30(7):1327-1336. doi: 10.1111/jep.14041. Epub 2024 Jun 21.

Using MEDLINE as a knowledge source for disambiguating abbreviations and acronyms in full-text biomedical journal articles.

J Biomed Inform. 2007 Apr;40(2):150-9. doi: 10.1016/j.jbi.2006.06.001. Epub 2006 Jun 7.

Leveraging Large Language Models for Clinical Abbreviation Disambiguation.

J Med Syst. 2024 Feb 27;48(1):27. doi: 10.1007/s10916-024-02049-z.

A convolutional route to abbreviation disambiguation in clinical text.

J Biomed Inform. 2018 Oct;86:71-78. doi: 10.1016/j.jbi.2018.07.025. Epub 2018 Aug 15.

Disambiguating Clinical Abbreviations Using a One-Fits-All Classifier Based on Deep Learning Techniques.

Methods Inf Med. 2022 Jun;61(S 01):e28-e34. doi: 10.1055/s-0042-1742388. Epub 2022 Feb 1.

OrganismTagger: detection, normalization and grounding of organism entities in biomedical documents.

Bioinformatics. 2011 Oct 1;27(19):2721-9. doi: 10.1093/bioinformatics/btr452. Epub 2011 Aug 9.

Customization scenarios for de-identification of clinical notes.

BMC Med Inform Decis Mak. 2020 Jan 30;20(1):14. doi: 10.1186/s12911-020-1026-2.

Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification.

J Am Med Inform Assoc. 2019 Nov 1;26(11):1247-1254. doi: 10.1093/jamia/ocz149.

Cited by

Processing of Short-Form Content in Clinical Narratives: Systematic Scoping Review.

J Med Internet Res. 2024 Sep 26;26:e57852. doi: 10.2196/57852.

Disambiguation of acronyms in clinical narratives with large language models.

J Am Med Inform Assoc. 2024 Sep 1;31(9):2040-2046. doi: 10.1093/jamia/ocae157.

Biomedical text readability after hypernym substitution with fine-tuned large language models.

PLOS Digit Health. 2024 Apr 16;3(4):e0000489. doi: 10.1371/journal.pdig.0000489. eCollection 2024 Apr.

Leveraging Large Language Models for Clinical Abbreviation Disambiguation.

J Med Syst. 2024 Feb 27;48(1):27. doi: 10.1007/s10916-024-02049-z.

Standigm ASK™: knowledge graph and artificial intelligence platform applied to target discovery in idiopathic pulmonary fibrosis.

Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae035.

Deciphering clinical abbreviations with a privacy protecting machine learning system.

Nat Commun. 2022 Dec 2;13(1):7456. doi: 10.1038/s41467-022-35007-9.

PhenoPad: Building AI enabled note-taking interfaces for patient encounters.

NPJ Digit Med. 2022 Jan 27;5(1):12. doi: 10.1038/s41746-021-00555-9.
