Towards Comprehensive Clinical Abbreviation Disambiguation Using Machine-Labeled Training Data.

Author information

Finley Gregory P, Pakhomov Serguei V S, McEwan Reed, Melton Genevieve B

Affiliations

Institute for Health Informatics; Department of Surgery.

Institute for Health Informatics; College of Pharmacy, University of Minnesota, Minneapolis, MN.

Publication information

AMIA Annu Symp Proc. 2017 Feb 10;2016:560-569. eCollection 2016.

Abstract

Abbreviation disambiguation in clinical texts is a problem handled well by fully supervised machine learning methods. Acquiring training data, however, is expensive and would be impractical for large numbers of abbreviations in specialized corpora. An alternative is a semi-supervised approach, in which training data are automatically generated by substituting long forms in natural text with their corresponding abbreviations. Most prior implementations of this method either focus on very few abbreviations or do not test on real-world data. We present a realistic use case by testing several semi-supervised classification algorithms on a large hand-annotated medical record of occurrences of 74 ambiguous abbreviations. Despite notable differences between training and test corpora, classifiers achieve up to 90% accuracy. Our tests demonstrate that semi-supervised abbreviation disambiguation is a viable and extensible option for medical NLP systems.
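
The data-generation step described in the abstract (replacing long forms in natural text with their abbreviations, so that the original long form serves as the label) can be illustrated with a minimal sketch. This is not the authors' implementation: the sense inventory `SENSES`, the function name `make_training_examples`, and the sample sentences are all hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical sense inventory: abbreviation -> candidate long forms.
# A real system would draw this from a curated clinical sense inventory.
SENSES: Dict[str, List[str]] = {
    "RA": ["rheumatoid arthritis", "right atrium"],
}


def make_training_examples(
    sentences: List[str], senses: Dict[str, List[str]]
) -> List[Tuple[str, str, str]]:
    """Return (abbreviated sentence, abbreviation, long-form label) triples.

    Each sentence containing a known long form is rewritten so the long form
    is replaced by its abbreviation; the replaced long form becomes the label.
    """
    examples = []
    for abbr, long_forms in senses.items():
        for long_form in long_forms:
            # Match the long form as a whole phrase, ignoring case.
            pattern = re.compile(
                r"\b" + re.escape(long_form) + r"\b", re.IGNORECASE
            )
            for sentence in sentences:
                if pattern.search(sentence):
                    examples.append(
                        (pattern.sub(abbr, sentence), abbr, long_form)
                    )
    return examples


if __name__ == "__main__":
    corpus = [
        "Patient has rheumatoid arthritis in both hands.",
        "Catheter tip lies in the right atrium.",
        "No acute distress.",
    ]
    for text, abbr, label in make_training_examples(corpus, SENSES):
        print(f"{abbr} -> {label}: {text}")
```

The resulting triples could then be fed to any standard classifier that predicts the long form from the words surrounding the abbreviation. Which classifiers and context features the paper actually used is not stated in the abstract.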
