Methods for building sense inventories of abbreviations in clinical notes.

Authors

Xu Hua, Stetson Peter D, Friedman Carol

Affiliation

Department of Biomedical Informatics, Columbia University, New York, NY, USA.

Publication information

J Am Med Inform Assoc. 2009 Jan-Feb;16(1):103-8. doi: 10.1197/jamia.M2927. Epub 2008 Oct 24.

DOI:10.1197/jamia.M2927

PMID:18952935

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2605589/

Abstract

OBJECTIVE

To develop methods for building corpus-specific sense inventories of abbreviations occurring in clinical documents.

DESIGN

A corpus of internal medicine admission notes was collected, and the instances of each clinical abbreviation in the corpus were grouped into sense clusters. One instance from each cluster was manually annotated to generate a final list of senses. Two clustering-based methods, Expectation Maximization (EM) and Farthest First (FF), and one random sampling method for sense detection were evaluated using a set of 12 clinical abbreviations.
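The cluster-then-annotate design can be illustrated with a minimal sketch of farthest-first clustering. This is not the authors' implementation: the bag-of-words context features, the Jaccard distance, the choice of the cluster center as the instance sent to the annotator, and all function names and toy contexts are assumptions made for illustration only.

```python
from typing import List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def farthest_first_centers(contexts: List[Set[str]], k: int, first: int = 0) -> List[int]:
    """Pick up to k center indices: start at `first`, then repeatedly
    add the instance that is farthest from its nearest chosen center."""
    centers = [first]
    # Distance from every instance to its nearest chosen center so far.
    nearest = [jaccard_distance(c, contexts[first]) for c in contexts]
    while len(centers) < min(k, len(contexts)):
        nxt = max(range(len(contexts)), key=lambda i: nearest[i])
        if nearest[nxt] == 0.0:
            break  # every remaining instance coincides with a center
        centers.append(nxt)
        for i, c in enumerate(contexts):
            nearest[i] = min(nearest[i], jaccard_distance(c, contexts[nxt]))
    return centers


def assign_clusters(contexts: List[Set[str]], centers: List[int]) -> List[int]:
    """Label each instance with the position of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: jaccard_distance(c, contexts[centers[j]]))
        for c in contexts
    ]


# Hypothetical context words around four occurrences of one abbreviation.
toy_contexts = [
    {"patient", "admitted", "chest"},
    {"patient", "admitted", "pain"},
    {"physical", "therapy", "consult"},
    {"therapy", "physical", "daily"},
]
centers = farthest_first_centers(toy_contexts, k=2)
labels = assign_clusters(toy_contexts, centers)
```

In this sketch only the instances indexed by `centers` would be shown to an annotator, so the number of clusters `k` bounds the manual effort per abbreviation.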

MEASUREMENTS

The clustering-based sense detection methods were evaluated using a set of clinical abbreviations that were manually sense annotated. "Sense Completeness" and "Annotation Cost" were used to measure the performance of different methods. Clustering error rates were also reported for different clustering algorithms.
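The abstract does not define the two measures, so the following sketch rests on assumed definitions: "Sense Completeness" is taken as the fraction of gold-standard senses that appear among the annotated instances, and "Annotation Cost" as the number of distinct instances a human annotates. The function names and the toy labels are hypothetical.

```python
from typing import List, Sequence


def sense_completeness(gold_senses: Sequence[str], annotated: Sequence[int]) -> float:
    """Assumed definition: share of distinct gold senses covered by
    the instances that were selected for annotation."""
    all_senses = set(gold_senses)
    found = {gold_senses[i] for i in annotated}
    return len(found) / len(all_senses)


def annotation_cost(annotated: Sequence[int]) -> int:
    """Assumed definition: number of distinct instances annotated."""
    return len(set(annotated))


# Hypothetical gold-standard senses for five occurrences of one abbreviation.
gold: List[str] = ["patient", "patient", "physical therapy", "physical therapy", "patient"]

# One instance per cluster versus an unlucky sample of the same size.
clustered_pick = [0, 2]
unlucky_pick = [0, 1]

full = sense_completeness(gold, clustered_pick)
partial = sense_completeness(gold, unlucky_pick)
cost = annotation_cost(clustered_pick)
```

Under these assumed definitions, both selections cost the same to annotate, but only the one that draws from each cluster recovers every sense.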

RESULTS

A clustering-based semi-automated method was developed to build corpus-specific sense inventories for abbreviations in hospital admission notes. Evaluation demonstrated that this method could substantially reduce manual annotation cost and increase the completeness of sense inventories when compared with a manual annotation method using random samples.

CONCLUSION

The authors developed an effective clustering-based method for building corpus-specific sense inventories for abbreviations in a clinical corpus. To the best of the authors' knowledge, this is the first time clustering technologies have been used to help build sense inventories of abbreviations in clinical text. The results demonstrated that the clustering-based method performed better than the manual annotation method using random samples for the task of building sense inventories of clinical abbreviations.
