Towards Comprehensive Clinical Abbreviation Disambiguation Using Machine-Labeled Training Data.

Authors

Finley Gregory P, Pakhomov Serguei V S, McEwan Reed, Melton Genevieve B

Affiliations

Institute for Health Informatics; Department of Surgery.

Institute for Health Informatics; College of Pharmacy, University of Minnesota, Minneapolis, MN.

Publication information

AMIA Annu Symp Proc. 2017 Feb 10;2016:560-569. eCollection 2016.

PMID: 28269852
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5333249/
Abstract

Abbreviation disambiguation in clinical texts is a problem handled well by fully supervised machine learning methods. Acquiring training data, however, is expensive and would be impractical for large numbers of abbreviations in specialized corpora. An alternative is a semi-supervised approach, in which training data are automatically generated by substituting long forms in natural text with their corresponding abbreviations. Most prior implementations of this method either focus on very few abbreviations or do not test on real-world data. We present a realistic use case by testing several semi-supervised classification algorithms on a large hand-annotated medical record of occurrences of 74 ambiguous abbreviations. Despite notable differences between training and test corpora, classifiers achieve up to 90% accuracy. Our tests demonstrate that semi-supervised abbreviation disambiguation is a viable and extensible option for medical NLP systems.
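
The data-generation step described in the abstract (substituting long forms in natural text with their abbreviations, so that the original long form serves as a machine-generated label) can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption made for illustration: the two-sense inventory for "RA", the three-token context window, the sample sentences, and the function name `make_training_examples` are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical sense inventory: abbreviation -> candidate long forms.
# A real system would cover many abbreviations (the paper tests 74).
SENSE_INVENTORY = {
    "RA": ["rheumatoid arthritis", "right atrium"],
}


def make_training_examples(text, inventory, window=3):
    """Find each long form in the text, swap in its abbreviation,
    and keep the long form as the label for the surrounding context."""
    examples = []
    for abbr, long_forms in inventory.items():
        for long_form in long_forms:
            pattern = re.compile(
                r"\b" + re.escape(long_form) + r"\b", re.IGNORECASE
            )
            for match in pattern.finditer(text):
                # Whitespace tokens on each side of the matched long form.
                left = text[:match.start()].split()[-window:]
                right = text[match.end():].split()[:window]
                examples.append({
                    "abbreviation": abbr,
                    "label": long_form,
                    "context": left + [abbr] + right,
                })
    return examples


# Assumed sample text standing in for a large unlabeled corpus.
corpus = (
    "The patient has a history of rheumatoid arthritis treated with "
    "methotrexate. Echocardiography showed a dilated right atrium and "
    "normal left ventricle."
)

examples = make_training_examples(corpus, SENSE_INVENTORY)
for ex in examples:
    print(ex["label"], "->", " ".join(ex["context"]))
```

Each resulting record pairs an abbreviation-in-context with the sense it replaced, which is the kind of input a standard classifier could then be trained on. The sketch leaves out the classification step and the training/test corpus mismatch that the abstract notes.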

